Create Sphinx-based documentation

Davide Brunato 2018-05-07 08:35:28 +02:00
parent db170d5bca
commit f2cbd6e401
12 changed files with 428 additions and 74 deletions


@ -1,6 +1,8 @@
***********
===========
elementpath
***********
===========
.. elementpath-introduction
The purpose of this package is to provide XPath 1.0 and 2.0 selectors for Python's ElementTree XML
data structures, both for the standard ElementTree library and for the
@ -14,7 +16,7 @@ provides. If you want you can contribute to add an unimplemented function see th
Installation and usage
======================
----------------------
You can install the package with *pip* in a Python 2.7 or Python 3.3+ environment::
@ -22,79 +24,49 @@ You can install the package with *pip* in a Python 2.7 or Python 3.3+ environmen
To use it, import the package and apply the selectors to ElementTree nodes:
.. code-block:: pycon
.. doctest::
>>> import elementpath
>>> from xml.etree import ElementTree
>>> xt = ElementTree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
>>> elementpath.select(xt, '/A/B2/*')
...
>>> root = ElementTree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
>>> elementpath.select(root, '/A/B2/*')
[<Element 'C1' at ...>, <Element 'C2' at ...>, <Element 'C3' at ...>]
The *select* API returns the standard XPath result format, which is either a list or a built-in
basic data value. If you only want to iterate over the results, use the generator function
*iter_select*, which accepts the same arguments as *select*.
The selectors API also works with XML data trees based on the `lxml.etree <http://lxml.de>`_
library:
.. doctest::
>>> import elementpath
>>> import lxml.etree as etree
>>> root = etree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
>>> elementpath.select(root, '/A/B2/*')
[<Element C1 at ...>, <Element C2 at ...>, <Element C3 at ...>]
Public API
==========
When you need to apply the same XPath expression to several XML documents you can also use the
*Selector* class: create an instance once, then use it to apply the path to each XML
tree:
The package includes some classes and functions for XPath parsers and selectors.
.. doctest::
XPath1Parser
------------
.. code-block:: python
class XPath1Parser(namespaces=None, variables=None, strict=True)
The XPath 1.0 parser. Provide a *namespaces* dictionary argument for mapping namespace prefixes to URI
inside expressions. With *variables* you can pass a dictionary with the static context's in-scope variables.
If *strict* is set to `False` the parser enables parsing of QNames, like the ElementPath library.
XPath2Parser
------------
.. code-block:: python
XPath2Parser(namespaces=None, variables=None, strict=True, default_namespace='', function_namespace=None,
schema=None, build_constructors=False, compatibility_mode=False)
The XPath 2.0 parser, that is the default parser. It has additional arguments compared to the parent class.
*default_namespace* is the namespace to apply to unprefixed names. For default no namespace is applied
(the empty namespace '').
*function_namespace* is the default namespace to apply to unprefixed function names (the
"http://www.w3.org/2005/xpath-functions" namespace for default).
*schema* is an optional instance of an XML Schema interface as defined by the abstract class
`AbstractSchemaProxy`.
*build_constructors* indicates when to define constructor functions for the in-scope XSD atomic types.
The *compatibility_mode* flag indicates if the XPath 2.0 parser has to work in compatibility
with XPath 1.0.
XPath selectors
---------------
.. code-block:: python
select(root, path, namespaces=None, schema=None, parser=XPath2Parser)
Apply *path* expression on *root* Element. The *root* argument can be an ElementTree instance
or an Element instance.
Returns a list with XPath nodes or a basic type for expressions based on a function or literal.
.. code-block:: python
iter_select(root, path, namespaces=None, schema=None, parser=XPath2Parser)
Iterator version of *select*, if you want to process each result one by one.
.. code-block:: python
Selector(path, namespaces=None, schema=None, parser=XPath2Parser)
Create an instance of this class if you want to apply an XPath selector to several target data.
An instance provides *select* and *iter_select* methods with a *root* argument that has the
same meaning that as for the *select* API.
>>> import elementpath
>>> import lxml.etree as etree
>>> selector = elementpath.Selector('/A/*/*')
>>> root = etree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
>>> selector.select(root)
[<Element C1 at ...>, <Element C2 at ...>, <Element C3 at ...>]
>>> root = etree.XML('<A><B1><C0/></B1><B2><C1/><C2/><C3/></B2></A>')
>>> selector.select(root)
[<Element C0 at ...>, <Element C1 at ...>, <Element C2 at ...>, <Element C3 at ...>]
Contributing
============
------------
You can contribute to this package by reporting bugs through the issue tracker or by opening a pull request.
In case you open an issue please try to provide a test or test data for reproducing the wrong
@ -111,7 +83,8 @@ implement other types of parsers (I think it could be also a funny exercise!).
License
=======
-------
This software is distributed under the terms of the MIT License.
See the file 'LICENSE' in the root directory of the present
distribution, or http://opensource.org/licenses/MIT.
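The README examples above moved from `code-block:: pycon` to `doctest::`, so their output is now checked, and the `at ...` placeholders only match when doctest's `ELLIPSIS` option is active. The sketch below shows that behaviour with the standard `doctest` module alone. It is an illustration under assumptions: `ElementTree.findall` stands in for `elementpath.select`, and the `run` helper is hypothetical.

```python
import doctest

# Same shape as the README example, but using only the standard library.
EXAMPLE = """
>>> from xml.etree import ElementTree
>>> root = ElementTree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
>>> root.findall('./B2/*')
[<Element 'C1' at ...>, <Element 'C2' at ...>, <Element 'C3' at ...>]
"""


def run(optionflags):
    """Run EXAMPLE with the given doctest flags and return the failure count."""
    test = doctest.DocTestParser().get_doctest(EXAMPLE, {}, 'readme', None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    results = runner.run(test, out=lambda text: None)  # discard the report
    return results.failed


failures_plain = run(0)                   # '...' is compared literally
failures_ellipsis = run(doctest.ELLIPSIS)  # '...' matches the memory address
print(failures_plain, failures_ellipsis)
```

If a Sphinx build reports mismatches on these examples, the `doctest_default_flags` setting in `conf.py` is the first place to look.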

20
doc/Makefile Normal file

@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SPHINXPROJ = elementpath
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

186
doc/conf.py Normal file

@ -0,0 +1,186 @@
# -*- coding: utf-8 -*-
#
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/stable/config
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
# Extends the path with parent directory in order to import elementpath from
# the project directory also if it's installed.
import sys
import os
sys.path.insert(0, os.path.abspath('..'))
# -- Project information -----------------------------------------------------
project = 'elementpath'
copyright = '2018, Davide Brunato'
author = 'Davide Brunato'
# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = '1.0.6'
# -- General configuration ---------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.doctest',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path .
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'elementpathdoc'
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'elementpath.tex', 'elementpath Documentation',
'Davide Brunato', 'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'elementpath', 'elementpath Documentation',
[author], 1)
]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'elementpath', 'elementpath Documentation',
author, 'elementpath', 'One line description of project.',
'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project
epub_author = author
epub_publisher = author
epub_copyright = copyright
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# -- Extension configuration -------------------------------------------------

15
doc/index.rst Normal file

@ -0,0 +1,15 @@
.. elementpath documentation master file, created by
sphinx-quickstart on Fri May 4 19:54:35 2018.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
elementpath documentation
=========================
.. toctree::
:maxdepth: 2
introduction
xpath_api
pratt_api

5
doc/introduction.rst Normal file

@ -0,0 +1,5 @@
Introduction
============
.. include:: ../README.rst
:start-after: elementpath-introduction
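The `:start-after:` option makes the `include` directive skip everything up to and including the first occurrence of the marker text, which is why the README gained the `.. elementpath-introduction` comment line. A rough Python model of that selection follows. The function name and the sample text are hypothetical, and the real directive is implemented by docutils, not by this code.

```python
def include_start_after(text, marker):
    """Return the part of *text* that follows the first occurrence of *marker*."""
    _, found, tail = text.partition(marker)
    if not found:
        # This sketch treats a missing marker as an error.
        raise ValueError('marker not found: %r' % marker)
    return tail


# A shortened stand-in for README.rst.
readme = (
    "===========\n"
    "elementpath\n"
    "===========\n"
    ".. elementpath-introduction\n"
    "The purpose of this package ...\n"
)

included = include_start_after(readme, 'elementpath-introduction')
print(included)
```

With this marker the README title is skipped, so the page shows only its own "Introduction" heading.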

36
doc/make.bat Normal file

@ -0,0 +1,36 @@
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build
set SPHINXPROJ=elementpath
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
:end
popd

33
doc/pratt_api.rst Normal file

@ -0,0 +1,33 @@
Pratt's parser API
==================
The TDOP (Top Down Operator Precedence) parser implemented within this library is a variant of the original
Pratt's parser, based on a class for the parser and metaclasses for tokens.
The parser base class includes helper functions for registering token classes,
Pratt's methods and a regexp-based tokenizer builder. There are also additional
methods and attributes that help with developing new parsers. Parsers can be defined
by class derivation, following a token registration procedure.
Token Base Class
----------------
.. autoclass:: elementpath.Token
Parser Base Class
-----------------
.. autoclass:: elementpath.Parser
.. automethod:: build_tokenizer
.. automethod:: parse
.. automethod:: advance
.. automethod:: expression
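For readers new to TDOP parsing, the following is a minimal sketch of the technique for integer arithmetic. It is not the library's `Parser` class: `MiniPratt` and the `LBP` table are hypothetical, tokens are plain strings instead of token classes built with metaclasses, and the expression is evaluated directly instead of producing a token tree. The method names `parse`, `advance` and `expression` mirror the ones documented above, while `nud` and `led` are Pratt's usual terms.

```python
import re

# Left binding powers: a higher value binds more tightly.
LBP = {'+': 10, '-': 10, '*': 20, '/': 20}


class MiniPratt:
    """Hypothetical evaluator for integer arithmetic, not the library's Parser."""

    def parse(self, source):
        # Whitespace and unknown characters are silently skipped in this sketch.
        self.tokens = iter(re.findall(r'\d+|[-+*/()]', source))
        self.advance()
        value = self.expression()
        if self.token is not None:
            raise SyntaxError('unexpected token %r' % self.token)
        return value

    def advance(self):
        # None marks the end of the input.
        self.token = next(self.tokens, None)

    def expression(self, rbp=0):
        left = self.nud()
        # Keep absorbing operators that bind tighter than the caller's context.
        while LBP.get(self.token, 0) > rbp:
            operator = self.token
            self.advance()
            left = self.led(operator, left)
        return left

    def nud(self):
        # Null denotation: a token that starts an expression.
        token = self.token
        if token is None:
            raise SyntaxError('unexpected end of input')
        self.advance()
        if token.isdigit():
            return int(token)
        if token == '-':
            return -self.expression(100)  # unary minus binds tightest
        if token == '(':
            value = self.expression()
            if self.token != ')':
                raise SyntaxError("expected ')'")
            self.advance()
            return value
        raise SyntaxError('unexpected token %r' % token)

    def led(self, operator, left):
        # Left denotation: a binary operator with its left operand known.
        right = self.expression(LBP[operator])
        if operator == '+':
            return left + right
        if operator == '-':
            return left - right
        if operator == '*':
            return left * right
        return left / right


parser = MiniPratt()
print(parser.parse('1 + 2 * 3'))
print(parser.parse('2 * (3 + 4) - 5'))
```

Passing the operator's own binding power to the recursive `expression` call makes the binary operators left-associative, so `10 - 4 - 3` groups as `(10 - 4) - 3`.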

32
doc/xpath_api.rst Normal file

@ -0,0 +1,32 @@
Public XPath API
================
The package includes some classes and functions that implement XPath parsers and selectors.
XPath parsers
-------------
.. autoclass:: elementpath.XPath1Parser
.. autoclass:: elementpath.XPath2Parser
XPath selectors
---------------
.. autofunction:: elementpath.select
.. autofunction:: elementpath.iter_select
.. autoclass:: elementpath.Selector
.. autoattribute:: namespaces
.. automethod:: select
.. automethod:: iter_select
XPath dynamic context
---------------------
.. autoclass:: elementpath.XPathContext


@ -22,19 +22,21 @@ from .xpath_helpers import AttributeNode, NamespaceNode, UntypedAtomic
from .xpath_token import XPathToken
from .xpath_context import XPathContext
from .xpath1_parser import XPath1Parser
from .xpath2_parser import XPath2Parser
from .xpath2_parser import XPath2Parser as XPath2Parser
from .schema_proxy import AbstractSchemaProxy, XMLSchemaProxy
def select(root, path, namespaces=None, parser=XPath2Parser, **kwargs):
"""
XPath selector function.
XPath selector function that applies a *path* expression to the *root* Element.
:param root: An Element or ElementTree instance.
:param path: The XPath expression.
:param namespaces: A dictionary with mapping from namespace prefixes into URIs.
:param parser: The parser class to use, that is the XPath 2.0 class for default.
:param kwargs: Other optional parameters for XPath parser class.
:return: A list with XPath nodes or a basic type for expressions based \
on a function or literal.
"""
parser = parser(namespaces, **kwargs)
root_token = parser.parse(path)
@ -44,15 +46,16 @@ def select(root, path, namespaces=None, parser=XPath2Parser, **kwargs):
def iter_select(root, path, namespaces=None, parser=XPath2Parser, **kwargs):
"""
XPath selector generator function.
A function that creates an XPath selector generator for applying a *path* expression
to the *root* Element.
:param root: An Element or ElementTree instance.
:param path: The XPath expression.
:param namespaces: A dictionary with mapping from namespace prefixes into URIs.
:param parser: The parser class to use, that is the XPath 2.0 class for default.
:param kwargs: Other optional parameters for XPath parser class.
:return: A generator of the XPath expression results.
"""
parser = parser(namespaces, **kwargs)
root_token = parser.parse(path)
context = XPathContext(root)
@ -61,12 +64,20 @@ def iter_select(root, path, namespaces=None, parser=XPath2Parser, **kwargs):
class Selector(object):
"""
XPath selector class.
XPath selector class. Create an instance of this class if you want to apply an XPath
selector to several target XML trees.
:param path: The XPath expression.
:param namespaces: A dictionary with mapping from namespace prefixes into URIs.
:param parser: The parser class to use, that is the XPath 2.0 class for default.
:param kwargs: Other optional parameters for XPath parser class.
:ivar path: The XPath expression.
:vartype path: str
:ivar parser: The parser instance.
:vartype parser: XPath1Parser or XPath2Parser
:ivar root_token: The root of tokens tree compiled from path.
:vartype root_token: XPathToken
"""
def __init__(self, path, namespaces=None, parser=XPath2Parser, **kwargs):
self.path = path
@ -80,12 +91,27 @@ class Selector(object):
@property
def namespaces(self):
"""A dictionary with mapping from namespace prefixes into URIs."""
return self.parser.namespaces
def select(self, root):
"""
Applies the instance's XPath expression on *root* Element.
:param root: An Element or ElementTree instance.
:return: A list with XPath nodes or a basic type for expressions based on \
a function or literal.
"""
context = XPathContext(root)
return self.root_token.get_results(context)
def iter_select(self, root):
"""
Creates an XPath selector generator for applying the instance's XPath expression
to the *root* Element.
:param root: An Element or ElementTree instance.
:return: A generator of the XPath expression results.
"""
context = XPathContext(root)
return self.root_token.select(context)
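The docstrings above describe two result styles, a list from `select` and a generator from `iter_select`, plus a `Selector` object that holds one path and applies it to many trees. The sketch below models that shape with the standard library only. `PathSelector` is a hypothetical stand-in built on ElementTree's limited path support, not the `elementpath.Selector` implementation, and the relative path `./*/*` replaces `/A/*/*` because ElementTree does not accept absolute paths on elements.

```python
from xml.etree import ElementTree


class PathSelector:
    """Hypothetical stand-in: stores one path and applies it to many trees."""

    def __init__(self, path):
        self.path = path  # stored once, reused for every call

    def select(self, root):
        # Eager variant: the full result list is built before returning.
        return root.findall(self.path)

    def iter_select(self, root):
        # Lazy variant: results are produced one at a time.
        return root.iterfind(self.path)


selector = PathSelector('./*/*')
first = ElementTree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
second = ElementTree.XML('<A><B1><C0/></B1><B2><C1/><C2/><C3/></B2></A>')

first_tags = [element.tag for element in selector.select(first)]
second_tags = [element.tag for element in selector.iter_select(second)]
print(first_tags)
print(second_tags)
```

In the real class the stored state is the compiled token tree (`root_token`), so the path is parsed once no matter how many trees it is applied to.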


@ -178,6 +178,17 @@ class Token(MutableSequence):
class Parser(object):
"""
Parser class for implementing a version of a Top Down Operator Precedence parser.
:cvar symbol_table: A dictionary that stores the token classes defined for the language.
:type symbol_table: dict
:cvar token_base_class: The base class for creating language's token classes.
:type token_base_class: Token
:cvar tokenizer: The language tokenizer compiled regexp.
:cvar SYMBOLS: A unified list of the definable tokens. It's an optional list useful \
if you want to make sure all language's symbols are included and defined.
"""
symbol_table = {}
token_base_class = Token
tokenizer = None
@ -230,6 +241,12 @@ class Parser(object):
self.source = ''
def parse(self, source):
"""
Parses source code written in the formal language.
:param source: The source string.
:return: The root of the token tree that represents the parsed source.
"""
try:
self.source = source
self.tokens = iter(self.tokenizer.finditer(source))


@ -34,7 +34,11 @@ XML_NCNAME_PATTERN = u"[{0}][\-.0-9\u00B7\u0300-\u036F\u203F-\u2040{0}]*".format
class XPath1Parser(Parser):
"""
XPath 1.0 expression parser class. The parser instance represents also the XPath static context.
XPath 1.0 expression parser class. A parser instance also represents the XPath static context.
With *variables* you can pass a dictionary with the static context's in-scope variables.
Provide a *namespaces* dictionary argument for mapping namespace prefixes to URIs inside
expressions. If *strict* is set to `False` the parser also enables the parsing of QNames,
like the ElementPath library.
:param namespaces: A dictionary with mapping from namespace prefixes into URIs.
:param variables: A dictionary with the static context's in-scope variables.


@ -29,10 +29,17 @@ from .schema_proxy import AbstractSchemaProxy
class XPath2Parser(XPath1Parser):
"""
XPath 2.0 expression parser class. The parser instance represents also the XPath static context.
XPath 2.0 expression parser class. This is the default parser used by XPath selectors.
A parser instance also represents the XPath static context. With *variables* you can pass
a dictionary with the static context's in-scope variables.
Provide a *namespaces* dictionary argument for mapping namespace prefixes to URIs inside
expressions. If *strict* is set to `False` the parser also enables the parsing of QNames,
like the ElementPath library. There are some additional XPath 2.0 related arguments.
:param namespaces: A dictionary with mapping from namespace prefixes into URIs.
:param variables: A dictionary with the static context's in-scope variables.
:param strict: If strict mode is `False` the parser enables parsing of QNames, \
like the ElementPath library. Default is `True`.
:param default_namespace: The default namespace to apply to unprefixed names. \
For default no namespace is applied (empty namespace '').
:param function_namespace: The default namespace to apply to unprefixed function names. \