====================
Import data workflow
====================

This document describes the import data workflow in detail, with hooks that
enable customization of the import process. The central aspect of the import
process is a resource's :meth:`~import_export.resources.Resource.import_data`
method, which is explained below.

.. function:: import_data(dataset, dry_run=False, raise_errors=False)

    The :meth:`~import_export.resources.Resource.import_data` method of
    :class:`~import_export.resources.Resource` is responsible for importing
    data from a given dataset.

    ``dataset`` is required and expected to be a :class:`tablib.Dataset` with
    a header row.

    ``dry_run`` is a Boolean which determines whether changes to the database
    are made or the import is only simulated. It defaults to ``False``.

    ``raise_errors`` is a Boolean. If ``True``, errors encountered during the
    import are raised. The default is ``False``, which means that any errors
    and their tracebacks are saved in the ``Result`` instance instead.

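For illustration, here is a minimal sketch of driving ``import_data``
directly. The ``BookResource`` and the column names are assumptions for the
example, not part of the library:

.. code-block:: python

    import tablib

    from myapp.resources import BookResource  # hypothetical resource

    dataset = tablib.Dataset(headers=["id", "name"])
    dataset.append(["1", "Some book"])

    resource = BookResource()

    # Simulate first: errors are collected in the result, nothing is saved.
    result = resource.import_data(dataset, dry_run=True)

    if not result.has_errors():
        # Now perform the import for real.
        resource.import_data(dataset, dry_run=False)
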
This is what happens when the method is invoked:

#. First, a new :class:`~import_export.results.Result` instance, which holds
   errors and other information gathered during the import, is initialized.

   Then, an :class:`~import_export.instance_loaders.InstanceLoader`
   responsible for loading existing instances is initialized. A different
   :class:`~import_export.instance_loaders.BaseInstanceLoader` can be
   specified via the ``instance_loader_class`` attribute of
   :class:`~import_export.resources.ResourceOptions`. A
   :class:`~import_export.instance_loaders.CachedInstanceLoader` can be used
   to reduce the number of database queries (see the first sketch after this
   list). See the `source <https://github.com/django-import-export/django-import-export/blob/master/import_export/instance_loaders.py>`_
   for available implementations.

#. The :meth:`~import_export.resources.Resource.before_import` hook is called.
   By implementing this method in your resource, you can customize the import
   process (see the hooks sketch after this list).

#. Each row of the to-be-imported dataset is processed according to the
   following steps:

   #. The :meth:`~import_export.resources.Resource.before_import_row` hook is
      called to allow row data to be modified before it is imported.

   #. :meth:`~import_export.resources.Resource.get_or_init_instance` is called
      with the current
      :class:`~import_export.instance_loaders.BaseInstanceLoader` and the
      current row of the dataset, returning an object and a Boolean declaring
      whether the object is newly created.

      If no object can be found for the current row,
      :meth:`~import_export.resources.Resource.init_instance` is invoked to
      initialize an object.

      As always, you can override
      :meth:`~import_export.resources.Resource.init_instance` to customize how
      the new object is created (e.g. to set default values; see the sketch
      after this list).

   #. :meth:`~import_export.resources.Resource.for_delete` is called to
      determine whether the passed ``instance`` should be deleted (see the
      sketch after this list). If so, the import process for the current row
      stops at this point.

   #. If the instance was not deleted in the previous step,
      :meth:`~import_export.resources.Resource.import_obj` is called with the
      ``instance`` as the current object, ``row`` as the current row, and
      ``dry_run``.

      :meth:`~import_export.resources.Resource.import_field` is called for
      each field in :class:`~import_export.resources.Resource`, skipping
      many-to-many fields. Many-to-many fields are skipped because they
      require instances to have a primary key, so their assignment is
      postponed until the object has been saved.

      :meth:`~import_export.resources.Resource.import_field` in turn calls
      :meth:`~import_export.fields.Field.save` if ``Field.attribute`` is set
      and ``Field.column_name`` exists in the given row.

   #. It is then determined whether the newly imported object differs from
      the already present object, and therefore whether the given row should
      be skipped. This is handled by calling
      :meth:`~import_export.resources.Resource.skip_row` with ``original`` as
      the original object and ``instance`` as the current object from the
      dataset (see the sketch after this list). If the current row is to be
      skipped, ``row_result.import_type`` is set to ``IMPORT_TYPE_SKIP``.

   #. If the current row is not to be skipped,
      :meth:`~import_export.resources.Resource.save_instance` is called, which
      actually saves the instance when ``dry_run`` is not set.

      There are two hook methods (which by default do nothing) giving you the
      option to customize the import process (see the sketch after this
      list):

      * :meth:`~import_export.resources.Resource.before_save_instance`
      * :meth:`~import_export.resources.Resource.after_save_instance`

      Both methods receive ``instance`` and ``dry_run`` arguments.

   #. :meth:`~import_export.resources.Resource.save_m2m` is called to save
      many-to-many fields.

   #. The :class:`~import_export.results.RowResult` is assigned a diff between
      the original and the imported object's fields, as well as an
      ``import_type`` attribute which states whether the row is new, updated,
      skipped or deleted.

      If an exception is raised during row processing and
      :meth:`~import_export.resources.Resource.import_data` was invoked with
      ``raise_errors=False`` (the default), the particular traceback is
      appended to the :class:`~import_export.results.RowResult` as well.

      If either the row was not skipped or the
      :class:`~import_export.resources.Resource` is configured to report
      skipped rows (the ``report_skipped`` option of its ``Meta``), the
      :class:`~import_export.results.RowResult` is appended to the
      :class:`~import_export.results.Result`.

   #. The :meth:`~import_export.resources.Resource.after_import_row` hook is
      called (see the hooks sketch after this list).

#. The :class:`~import_export.results.Result` is returned.
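
Step 1 above mentions that the instance loader can be swapped via the
``instance_loader_class`` attribute. A minimal sketch of doing so in a
resource's ``Meta``, assuming a hypothetical ``Book`` model:

.. code-block:: python

    from import_export import resources
    from import_export.instance_loaders import CachedInstanceLoader

    from myapp.models import Book  # hypothetical model, for illustration


    class BookResource(resources.ModelResource):

        class Meta:
            model = Book
            # Load all existing instances up front instead of issuing one
            # query per imported row.
            instance_loader_class = CachedInstanceLoader
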
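The ``before_import``, ``before_import_row`` and ``after_import_row`` hooks do
nothing by default, so overriding them is enough. A sketch; the ``name``
column is an assumption, and hook signatures may vary slightly between
versions of the library:

.. code-block:: python

    from import_export import resources

    from myapp.models import Book  # hypothetical model


    class BookResource(resources.ModelResource):

        def before_import(self, dataset, dry_run, **kwargs):
            # Runs once before any row is processed, e.g. to check headers.
            if "name" not in dataset.headers:
                raise ValueError("missing 'name' column")

        def before_import_row(self, row, **kwargs):
            # Runs per row; the row dict may be modified in place.
            row["name"] = row["name"].strip()

        def after_import_row(self, row, row_result, **kwargs):
            # Runs per row after the RowResult has been populated.
            if row_result.import_type == row_result.IMPORT_TYPE_SKIP:
                pass  # e.g. log skipped rows

        class Meta:
            model = Book
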
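``init_instance`` is only called when no existing object matched the row,
which makes it a natural place for default values. A sketch, with a
hypothetical ``source`` field:

.. code-block:: python

    from import_export import resources

    from myapp.models import Book  # hypothetical model


    class BookResource(resources.ModelResource):

        def init_instance(self, row=None):
            # Only reached when the instance loader found no match for
            # the current row.
            instance = Book()
            instance.source = "csv import"  # hypothetical default value
            return instance

        class Meta:
            model = Book
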
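``for_delete`` decides per row whether the matched instance should be removed
instead of imported. A sketch driven by a hypothetical ``delete`` column:

.. code-block:: python

    from import_export import resources

    from myapp.models import Book  # hypothetical model


    class BookResource(resources.ModelResource):

        def for_delete(self, row, instance):
            # Delete the instance when the (hypothetical) "delete" column
            # of the row contains "1".
            return row.get("delete") == "1"

        class Meta:
            model = Book
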
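The built-in ``skip_row`` returns ``True`` only when the resource's
``skip_unchanged`` Meta option is enabled and no field value differs;
overriding it allows custom skip logic. A sketch comparing a single
hypothetical ``name`` field (newer versions may pass additional arguments):

.. code-block:: python

    from import_export import resources

    from myapp.models import Book  # hypothetical model


    class BookResource(resources.ModelResource):

        def skip_row(self, instance, original):
            # Skip the row when the one field we care about is unchanged.
            return instance.name == original.name

        class Meta:
            model = Book
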
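``before_save_instance`` and ``after_save_instance`` are empty by default
and, as noted above, receive ``instance`` and ``dry_run``. A sketch; the
timestamp field is an assumption:

.. code-block:: python

    from django.utils import timezone

    from import_export import resources

    from myapp.models import Book  # hypothetical model


    class BookResource(resources.ModelResource):

        def before_save_instance(self, instance, dry_run):
            # Last chance to adjust the instance before it is written.
            instance.imported_at = timezone.now()  # hypothetical field

        def after_save_instance(self, instance, dry_run):
            if not dry_run:
                # Trigger follow-up work only when data was really saved.
                print("saved", instance.pk)

        class Meta:
            model = Book
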
Transaction support
-------------------

If transaction support is enabled, the whole import process is wrapped in a
transaction, which is rolled back or committed as appropriate.

All methods called from inside of ``import_data`` (create / delete / update)
receive ``False`` for the ``dry_run`` argument: within a transaction, even a
dry run applies its changes to the database, and the transaction is then
rolled back.

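Whether transactions are used defaults to the ``IMPORT_EXPORT_USE_TRANSACTIONS``
Django setting:

.. code-block:: python

    # settings.py
    # Wrap every import in a transaction (defaults to False in older
    # versions of django-import-export).
    IMPORT_EXPORT_USE_TRANSACTIONS = True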