Add info for building custom converters to docs (issue #109)

2019-06-14 09:07:05 +02:00 · 69405e662c
parent 50c99f7b01
commit 69405e662c
5 changed files with 91 additions and 4 deletions
--- a/doc/api.rst
+++ b/doc/api.rst
@@ -98,7 +98,7 @@ XSD globals maps API
    :members: copy, register, iter_schemas, iter_globals, clear, build


-.. _xml-schema-converters:
+.. _xml-schema-converters-api:

 XML Schema converters
 ---------------------
--- a/doc/converters.rst
+++ b/doc/converters.rst
@@ -0,0 +1,84 @@
+.. _customize-output-data:
+
+Customizing output data with converters
+=======================================
+
+XML data decoding and encoding are handled by an intermediate converter class
+instance that takes charge of composing inner data and mapping namespaces and prefixes.
+
+Because XML is a structured format that includes both data and metadata, such as
+attributes and namespace declarations, it is necessary to define conventions for
+naming the different data objects in a distinguishable way. For example, a widely used
+convention is to prefix attribute names with an '@' character. With this convention
+the attribute `name='John'` is decoded to `'@name': 'John'`, and `level='10'` is
+decoded to `'@level': 10`.
+
+A related topic is the mapping of namespaces. The expanded namespace representation
+is used within XML objects of the ElementTree library.
+For example, `{http://www.w3.org/2001/XMLSchema}string` is the fully qualified name of
+the XSD string type, usually referred to as *xs:string* or *xsd:string* with a namespace
+declaration. When XML data is serialized to a string, the names are remapped to the prefixed
+format. This mapping is also useful if you serialize XML data to another format
+like JSON, because prefixed names are more manageable and readable than expanded ones.
+
+
+Available converters
+--------------------
+
+The library includes some converters. The default converter :class:`XMLSchemaConverter`
+is the base class of the other converter types. Each derived converter type implements a
+well-known convention for converting XML to the JSON data format:
+
+  * :class:`ParkerConverter`: `Parker convention <https://developer.mozilla.org/en-US/docs/Archive/JXON#The_Parker_Convention>`_
+  * :class:`BadgerFishConverter`: `BadgerFish convention <http://www.sklar.com/badgerfish/>`_
+  * :class:`AbderaConverter`: `Apache Abdera project convention <https://cwiki.apache.org/confluence/display/ABDERA/JSON+Serialization>`_
+  * :class:`JsonMLConverter`: `JsonML (JSON Mark-up Language) convention <http://www.jsonml.org/>`_
+
+A summary of these and other conventions can be found on the wiki page
+`JSON and XML Conversion <http://wiki.open311.org/JSON_and_XML_Conversion/>`_.
+
+The base class, which does not implement any particular convention, has several options that
+can be used to vary the conversion process. Some of these options are not used by the other
+predefined converter types (e.g. *force_list* and *force_dict*) or are used with a fixed value
+(e.g. *text_key* or *attr_prefix*). See :ref:`xml-schema-converters-api` for details about
+base class options and attributes.
+
+
+Create a custom converter
+-------------------------
+
+To create a new customized converter you have to subclass :class:`XMLSchemaConverter`
+and redefine the two methods *element_decode* and *element_encode*. These methods are based
+on the namedtuple `ElementData`, an Element-like data structure that stores the decoded
+Element parts. This namedtuple is used by decoding and encoding methods as an intermediate
+data structure.
+
+The namedtuple `ElementData` has four attributes:
+
+  * **tag**: the element's tag string;
+  * **text**: the element's text, which can be a string or `None` for empty elements;
+  * **content**: the element's children, which can be a list or `None`;
+  * **attributes**: the element's attributes, which can be a dictionary or `None`.
+
+The method *element_decode* receives an `ElementData` instance with decoded data as its
+first argument. The other arguments are the XSD element to use for decoding and the level
+of the XML decoding process, used to add indent spaces for a readable string serialization.
+This method uses the input data element to compose the decoded data, typically a dictionary,
+a list, or a plain value for simple type elements.
+
+Conversely, the method *element_encode* receives the decoded object and decomposes it
+in order to build and return an `ElementData` instance. This instance has to contain the
+parts of the element that will then be encoded and used to build an XML Element instance.
+
+These two methods are also responsible for mapping and unmapping object names, but they don't
+have to decode or encode data, a task that is delegated to the methods of the XSD components.
+
+Depending on the format defined by your new converter class, you may provide a different
+value for the properties *lossless* and *losslessly*. The *lossless* property has to be `True` if your
+new converter class preserves all XML data information (e.g. as the *BadgerFish* convention does).
+Your new converter can also be *losslessly* if it's lossless and the element model structure
+and order are maintained (as with the JsonML convention).
+
+Furthermore, your new converter class can have a more specific `__init__` method in order
+to avoid the usage of unused options or to set the value of some other options. Finally, refer
+also to the code of the predefined derived converters to see how you can build your own.
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -12,6 +12,7 @@ xmlschema Documentation
    intro
    usage
    api
+    converters
    testing
    notes

--- a/doc/usage.rst
+++ b/doc/usage.rst
@@ -409,7 +409,7 @@ You can also change the data decoding process providing the keyword argument *co
    {'vh:bikes': {'vh:bike': [None, None]}, 'vh:cars': {'vh:car': [None, None]}}


-See the :ref:`xml-schema-converters` section for more information about converters.
+See the :ref:`customize-output-data` section for more information about converters.


 Decoding to JSON
--- a/xmlschema/converters.py
+++ b/xmlschema/converters.py
@@ -23,8 +23,10 @@ from xmlschema.namespaces import NamespaceMapper
 ElementData = namedtuple('ElementData', ['tag', 'text', 'content', 'attributes'])
 """
 Namedtuple for Element data interchange between decoders and converters.
-The field *tag* is a string containing the Element tag, *text* can be `None` or
-a string representing the Element's text. The field *content*..
+The field *tag* is a string containing the Element's tag, *text* can be `None` 
+or a string representing the Element's text, *content* can be `None` or a list 
+containing the Element's children, *attributes* can be `None` or a dictionary
+containing the Element's attributes. 
 """