Commit Graph

75 Commits

Author SHA1 Message Date
Davide Brunato 24a08c4442 Add replacing of backslashes from normalize_path result 2019-11-06 10:22:09 +01:00
Davide Brunato 896982222f Fix Windows paths normalization 2019-11-06 09:49:00 +01:00
Davide Brunato a374d15805 Fix resource tests for Python 2 2019-10-24 06:37:31 +02:00
Davide Brunato c075ff22e5 Complete the revision of resource module
- normalize_url() now processes file names containing '#' chars
- Fix iterfind() of lazy resource
- Add more tests for XML resources
2019-10-22 18:37:26 +02:00
Davide Brunato 8dd5d193ba Update XML resource iterfind() to fix issues #102 and #112
- Speed up admitting simple paths and checking only elements
  that match path level
- Avoid selection for * paths (about 35% faster)
- Add close() method to XmlResource
2019-10-21 06:57:02 +02:00
Davide Brunato 43322b6bc0 Refactor XmlResource after merge
- Remove _document and _fid (use the attribute source instead)
2019-10-19 00:08:09 +02:00
Davide Brunato 9443adf396 Merge branch 'master' of github.com:sissaschool/xmlschema into develop 2019-10-17 10:42:04 +02:00
Davide Brunato 54060ba0df Modify resources.fetch_schema_locations()
- Can now return a location for another namespace if hints for
  the resource namespace are missing
2019-10-16 21:14:15 +02:00
Davide Brunato 7d0d251837 Merge branch 'master' into b121 2019-10-16 18:11:49 +02:00
Davide Brunato b7b6fef418 Base modules refactoring for fix ElementTree import 2019-10-07 15:31:18 +02:00
Daniel Hillier 61e1f609fc Stop reading `name` and `url` from file object attrs
These attrs shouldn't be used to reopen the file object as:
- they may not reflect the original file or resource (file objects
  opened from a zipfile will have a name that doesn't correspond to any
  file on disk).
- Depending on how the fid was opened, these attrs could be crafted to
  read arbitrary files from disk. If the creator of a .zip gives a file
  inside the zip file a path of `/etc/passwd` we may end up opening that
  file.

Instead of reopening the file, we keep track of the file object and seek
to the beginning of the file. This means (for most operations) the file
object must be seekable. On Python 2 urlopen returns an unseekable
object for 'file://' paths. One test had to be skipped in Python 2 for
this reason.
2019-07-06 15:25:10 +10:00
Davide Brunato 81849f2368 Fix path normalization and tests for Windows platform 2019-06-19 20:02:45 +02:00
Davide Brunato 061d72c5c6 Update decode/encode methods for schemas
- Split decode()/encode() for components and for schemas
- Removed to_dict and to_etree for XSD components
- Updated fetch_schema_locations() to build XMLResource instance
2019-06-18 17:20:13 +02:00
Davide Brunato 9712319150 Fix issue #116 2019-06-12 09:21:37 +02:00
Davide Brunato 5d04d7e68a Implementing lazy validation
- Added XMLResource.iterfind() for XPath iteration of a resource
- Validator API refactored: remove path argument from iter_errors()
  of components, add validate, is_valid, iter_errors to XMLSchema
  class with additional arguments path and schema_path
- Fix test case patterns.xml (now also finds duplicated IDs)
2019-06-05 07:01:53 +02:00
Davide Brunato ccfaab9479 Fix XML resource defusing
- Add XMLResource.defusing() for checking XML data
- Now in XMLResource.parse() and fromsource() the SafeXMLParser is
  used only for defusing data
2019-02-25 13:34:52 +01:00
Davide Brunato 49f2fb1246 Update for close the release v1.0.9
- Fix SafeXMLParser and add tests for it
- SafeXMLParser raises only pure Python ParseError exception
- Add three XML cases with entities in xmlschema/tests/test_cases/resources/
2019-02-03 17:17:26 +01:00
Davide Brunato cec34eeea0 Refactor of etree.py module
- Remove SafeXMLParserError and use ElementTree.ParseError
- PyElementTree safe APIs errors are re-raised as C mod ParseError
- Simplify ElementTree API and XMLResource class
2019-01-25 17:55:32 +01:00
Davide Brunato 2eabc190fe Replace defusedxml dependency
- defusedxml seems to be unmaintained and has some problems
  with ElementTree loading
- Replaced by a safe XMLParser that forbids entity processing
2019-01-22 17:50:13 +01:00
Davide Brunato 9d6b88baae Change copyright years info 2019-01-20 16:56:10 +01:00
Xtreak 2975238747 Fix escape sequence warning using raw string 2018-12-23 11:37:41 +05:30
Davide Brunato bd90cacc7a QNames refactoring
- Moved all QNames to the same module
- Create module helpers.py that includes XSD parse utils and name
  manipulation helper functions
2018-10-08 23:47:18 +02:00
Davide Brunato a7f5c41a85 Fix regex module
- Added several tests
- Fixed start and end expression in regex.get_python_regex():
  now puts '^(' and ')$' instead of '^' and '$'.
- Fixed '.' conversion in regex.get_python_regex(): raw string
  qualifier removed from string literal.
2018-09-22 15:26:00 +02:00
Davide Brunato d72c16499b Fixes for handling of the default namespace and the XPath default namespace
- XMLResource.get_namespaces(): consider local root when adding
  another default namespace
- Fix for XPath default namespace handling in ElementPathMixin
2018-09-02 17:00:11 +02:00
Davide Brunato 96cb4b57af Refactoring of error and to etree_tostring serialization
- Added namespaces argument to etree_tostring helper method.
- Refactored validator error string representation.
- Moved namespaces argument to the last position for methods validate
  and iter_errors of class ValidationMixin.
- Fixed document validate API and added tests for it.
2018-08-27 15:43:19 +02:00
Davide Brunato 2ca1bb4fd3 Code cleaning for XsdGroup.iter_decode and iter_decode_children
- Fix for issue #73
- Removed while cycle in iter_decode
- Consider that iter_decode_children methods yield only children
  validation errors
- Added helper function etree_last_child
2018-07-23 13:30:56 +02:00
Davide Brunato f81fc41f33 Fix test_resources.py for Windows platform
- Fix for normalize_url to replace backslashes.
- Created a check_url method for TestResources class.
- Use pathlib to check paths: PureWindowsPath class is
  used for every Windows path (paths that contain '\\' or
  ':' or '|'), PurePath otherwise.
- Add leading slash to Windows paths with drive spec
  before converting to URL.
2018-07-18 14:49:05 +02:00
Davide Brunato e967aa6db8 Use pathlib for checking resource URLs 2018-07-18 11:30:13 +02:00
Davide Brunato 0af5307e6c Fix path normalization for Windows platform 2018-07-14 15:52:03 +02:00
Davide Brunato 80c200f651 Add is_etree_element helper function
- This is a safer test for Element objects for this package, because it
  also checks that the argument is not an instance of ElementPathMixin class
- Add tests for fetch_schema_locations and load_xml_resource functions
2018-07-13 22:40:35 +02:00
Davide Brunato 7980ea4e35 Completed tests for XMLResource class 2018-07-13 16:51:26 +02:00
Davide Brunato 684558794e Resource api normalize_url rewritten
- Now uses os.path.join for all URLs related to files, without
  messing up relative paths.
- Added keep_relative=False optional argument.
2018-07-13 12:53:34 +02:00
Davide Brunato c72442ed3c Fix XMLResource._fromsource() internal method
- File object processing failed to find the URL
- Considering StringIO processing
- Add tests for resource APIs
2018-07-12 19:04:03 +02:00
Davide Brunato dd43288abd Update test case parser for release
- "--defaults" option removed (cases merged within --skip)
- "--baseurl" removed (useless, the cases are all URL-based)
2018-07-12 17:32:36 +02:00
Davide Brunato c2a81102fc Update XMLResource class 2018-07-12 13:48:14 +02:00
Davide Brunato 472c9faddf Modify XMLResource to insert document and skip source loading
- ElementTree.parse() used when a URL is available.
- Skip source loading if not explicitly requested (also if it's lazy).
- Add a remote test with Dublin Core schemas.
2018-07-12 10:30:18 +02:00
Davide Brunato e06e5571a0 Fix resource locations processing for namespace imports
- Added copy(), __repr__ and __str__ to XMLResource class.
- Fixed XMLResource.get_locations(): now also accepts locations
  already stored into a NamespaceResourcesMap dictionary.
- XMLSchema.iter_decode now always creates an XMLResource from
  the source argument.
- Fix for XMLSchema.built property: has to count all namespace globals.
2018-07-11 14:13:23 +02:00
Davide Brunato a388689cc1 Complete resource API refactoring
- Set default timeout=30 to fetch_* functions
- Use keyword arguments (**resource_options) for providing options
  for XML resource related helper functions
- Add base_url to module level API
2018-07-11 08:58:35 +02:00
Davide Brunato afa7299a72 Written XMLResource.get_locations() method 2018-07-10 15:02:07 +02:00
Davide Brunato dc87c4ef61 Added fetch_namespaces() to resource API
- etree_get_namespaces() removed from etree.py but left in module
  as alias of fetch_namespaces for backward compatibility.
2018-07-10 12:41:39 +02:00
Davide Brunato 6c888747a5 Convert resource-related schema attributes to properties
- root, url, text, defuse, timeout changed to properties
2018-07-10 11:12:53 +02:00
Davide Brunato 779b66a622 Rewrite resource API inside
- Added XMLResource class for representing XML data sources
- The resource API is now based mainly on this class
- Attribute 'source' added to schema instances
2018-07-10 10:40:29 +02:00
Davide Brunato 93a8d3068d Replace allow_overrides tentative with base_dir 2018-07-09 15:11:47 +02:00
Davide Brunato 3e6ce18ae4 Rewritten includes and imports for schema initialization
- Now a warning message is sent to the logger for include or
  for namespace import errors
- Add XMLSchemaImportWarning and XMLSchemaIncludeWarning
- Add warning attribute to schemas for collecting the message
  strings about include and import warnings
- URIDict class removed (faulty with empty fragment #)
- Added --warning and --timeout to test factory arguments
- Removed --network from test factory arguments
2018-07-07 11:25:00 +02:00
Davide Brunato e3bf188413 Add 'timeout' argument to schema and 'defuse' and 'timeout' to resource API 2018-07-05 12:30:57 +02:00
Davide Brunato fe9040e258 Implement lossless and losslessly properties for converters 2018-07-05 09:33:49 +02:00
Davide Brunato b697c6f7b9 Fix a bug for normalize_url()
- The normalization failed when the url contains .. or . subpaths
  and a base_url with a valid scheme is provided
- The fix could also influence issue #44
2018-07-04 18:41:40 +02:00
Davide Brunato 3df48cc54d Fix for default converter element_decode in case of list of values 2018-07-03 23:11:27 +02:00
Davide Brunato 4128f21548 Improve converters and namespace decoding
- Fixed NamespaceMapper base class
- Added level=0 argument to XMLSchemaConverter.element_decode()
2018-06-27 15:06:14 +02:00
Davide Brunato a05abfe7ea Completion of the release 0.9.31
- Decoder tests completed for the default converter
2018-06-24 16:45:39 +02:00