Merge branch 'perf_cache_resolving'

* perf_cache_resolving:
  Squashed 'json/' changes from 9208016..0b657e8
  Need to preserve backwards compat for RefResolvers without the new methods.
  Pass in caches instead of arguments.
  I give up.
  Not deprecating these for now, just not used internally.
  Fix base_uri backwards compatibility.
  Er, green doesn't work on 2.6, and make running right out of a checkout easier.
  Wrong docstring.
  Add back assertions for backwards compat.
  Wait wat. Remove insanity.
  Probably should combine these at some point, but for now move them.
  Really run on the installed package.
  Begone py.test.
  Remove 3.3, use pip for installs, use green here too.
  lxml-cffi is giving obscure errors again.
  Fix a non-type in the docs.
  Switch to vcversioner, use repoze.lru only on 2.6, and add extras_require for format.
  Run tests on the installed package.
  Newer tox is slightly saner.
  It's hard to be enthusiastic about tox anymore.
  Use lru_cache
  Remove DefragResult.
  Remove context manager from ref() validation.
  Perf improvements by using a cache.
  Add benchmark script.
  Fix test failures
  issue #158: TRY to speed-up scope & $ref url-handling by keeping fragments separated from URL (and avoid redundant frag/defrag). Conflicts: jsonschema/tests/test_benchmarks.py
Julian Berman 2015-04-05 20:32:04 -04:00
commit 60fcbbf962
18 changed files with 347 additions and 119 deletions

.gitignore

@ -1,26 +1,5 @@
.DS_Store
.idea
*.pyc
*.pyo
*.egg-info
_build
build
dist
MANIFEST
.coverage
.coveragerc
coverage
htmlcov
_cache
_static
_templates
_trial_temp
.tox
TODO

@ -1,14 +1,5 @@
language: python
python:
- "pypy"
- "pypy3"
- "2.6"
- "2.7"
- "3.3"
- "3.4"
install:
- python setup.py -q install
script:
- if [[ "$(python -c 'import sys; print(sys.version_info[:2])')" == "(2, 6)" ]]; then pip install unittest2; fi
- py.test --tb=native jsonschema
- python -m doctest README.rst
- tox

@ -1,4 +1,5 @@
include *.rst
include COPYING
include tox.ini
include version.txt
recursive-include json *

@ -63,12 +63,8 @@ now uses setuptools.
Running the Test Suite
----------------------
``jsonschema`` uses the wonderful `Tox <http://tox.readthedocs.org>`_ for its
test suite. (It really is wonderful, if for some reason you haven't heard of
it, you really should use it for your projects).
Assuming you have ``tox`` installed (perhaps via ``pip install tox`` or your
package manager), just run ``tox`` in the directory of your source checkout to
If you have ``tox`` installed (perhaps via ``pip install tox`` or your
package manager), running ``tox`` in the directory of your source checkout will
run ``jsonschema``'s test suite on all of the versions of Python ``jsonschema``
supports. Note that you'll need to have all of those versions installed in
order to run the tests on each of them, otherwise ``tox`` will skip (and fail)

benchmarks/bench.py Normal file

@ -0,0 +1,74 @@
#!/usr/bin/env python
"""
Benchmark the performance of jsonschema.

Example benchmark:

    wget http://swagger.io/v2/schema.json
    wget http://petstore.swagger.io/v2/swagger.json
    python bench.py -r 5 schema.json swagger.json

"""
from __future__ import print_function
import argparse
import cProfile
import json
import time

import jsonschema


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('schema', help="path to a schema used to benchmark")
    parser.add_argument('document', help="document to validate with schema")
    # Default to a single iteration so that omitting -r does not crash.
    parser.add_argument('-r', '--repeat', type=int, default=1,
                        help="number of iterations")
    parser.add_argument('--profile',
                        help="Enable profiling, write profile to this filepath")
    return parser.parse_args()


def run(filename, schema, document):
    # Seed the store with the schema so its own id resolves without a fetch.
    resolver = jsonschema.RefResolver(
        'file://{0}'.format(filename),
        schema,
        store={schema['id']: schema})
    jsonschema.validate(document, schema, resolver=resolver)


def format_time(time_):
    return "%.3fms" % (time_ * 1000)


def run_timeit(schema_filename, document_filename, repeat, profile):
    with open(schema_filename) as schema_file:
        schema = json.load(schema_file)

    with open(document_filename) as fh:
        document = json.load(fh)

    if profile:
        profiler = cProfile.Profile()
        profiler.enable()

    times = []
    for _ in range(repeat):
        start_time = time.time()
        run(schema_filename, schema, document)
        times.append(time.time() - start_time)

    if profile:
        profiler.disable()
        profiler.dump_stats(profile)

    print(", ".join(map(format_time, sorted(times))))
    print("Mean: {0}".format(format_time(sum(times) / repeat)))


def main():
    args = parse_args()
    run_timeit(args.schema, args.document, args.repeat, args.profile)


if __name__ == "__main__":
    main()
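
The `--profile` option above writes raw `cProfile` data via `dump_stats`, which is not human-readable on its own. A minimal sketch of how such a file might be inspected with the standard library's `pstats` module follows; it assumes Python 3, and `summarize` and the temporary `bench.prof` path are hypothetical names that are not part of this commit.

```python
import cProfile
import io
import os
import pstats
import tempfile


def summarize(profile_path, limit=5):
    """Return a text report of the `limit` most expensive entries."""
    stream = io.StringIO()
    stats = pstats.Stats(profile_path, stream=stream)
    # Sort by cumulative time so expensive call trees come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


# Produce a tiny stand-in profile so the sketch is self-contained;
# with bench.py you would pass the path given to --profile instead.
profiler = cProfile.Profile()
profiler.enable()
sum(range(1000))
profiler.disable()

path = os.path.join(tempfile.mkdtemp(), "bench.prof")
profiler.dump_stats(path)

report = summarize(path)
print(report)
```

Sorting by cumulative time is only one choice; sorting by `"tottime"` instead highlights functions that are slow in their own bodies rather than through their callees.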

@ -60,23 +60,69 @@ Who Uses the Test Suite
This suite is being used by:
* [jsck (a fast JSON validator in CoffeeScript)](https://github.com/pandastrike/jsck)
* [json-schema-validator (Java)](https://github.com/fge/json-schema-validator)
* [jsonschema (python)](https://github.com/Julian/jsonschema)
* [aeson-schema (haskell)](https://github.com/timjb/aeson-schema)
* [direct-schema (javascript)](https://github.com/IreneKnapp/direct-schema)
* [jsonschema (javascript)](https://github.com/tdegrunt/jsonschema)
* [JaySchema (javascript)](https://github.com/natesilva/jayschema)
* [z-schema (javascript)](https://github.com/zaggino/z-schema)
* [jassi (javascript)](https://github.com/iclanzan/jassi)
* [json-schema-valid (javascript)](https://github.com/ericgj/json-schema-valid)
* [jesse (Erlang)](https://github.com/klarna/jesse)
* [json-schema (PHP)](https://github.com/justinrainbow/json-schema)
* [gojsonschema (Go)](https://github.com/sigu-399/gojsonschema)
* [json_schema (Dart)](https://github.com/patefacio/json_schema)
* [tv4 (JavaScript)](https://github.com/geraintluff/tv4)
* [Jsonary (JavaScript)](https://github.com/jsonary-js/jsonary)
* [json-schema (Ruby)](https://github.com/hoxworth/json-schema)
### Coffeescript ###
* [jsck](https://github.com/pandastrike/jsck)
### Dart ###
* [json_schema](https://github.com/patefacio/json_schema)
### Erlang ###
* [jesse](https://github.com/klarna/jesse)
### Go ###
* [gojsonschema](https://github.com/sigu-399/gojsonschema)
### Haskell ###
* [aeson-schema](https://github.com/timjb/aeson-schema)
* [hjsonschema](https://github.com/seagreen/hjsonschema)
### Java ###
* [json-schema-validator](https://github.com/fge/json-schema-validator)
### Javascript ###
* [json-schema-benchmark](https://github.com/Muscula/json-schema-benchmark)
* [direct-schema](https://github.com/IreneKnapp/direct-schema)
* [is-my-json-valid](https://github.com/mafintosh/is-my-json-valid)
* [jassi](https://github.com/iclanzan/jassi)
* [JaySchema](https://github.com/natesilva/jayschema)
* [json-schema-valid](https://github.com/ericgj/json-schema-valid)
* [Jsonary](https://github.com/jsonary-js/jsonary)
* [jsonschema](https://github.com/tdegrunt/jsonschema)
* [request-validator](https://github.com/bugventure/request-validator)
* [skeemas](https://github.com/Prestaul/skeemas)
* [tv4](https://github.com/geraintluff/tv4)
* [z-schema](https://github.com/zaggino/z-schema)
### .NET ###
* [Newtonsoft.Json.Schema](https://github.com/JamesNK/Newtonsoft.Json.Schema)
### PHP ###
* [json-schema](https://github.com/justinrainbow/json-schema)
### Python ###
* [jsonschema](https://github.com/Julian/jsonschema)
### Ruby ###
* [json-schema](https://github.com/hoxworth/json-schema)
### Rust ###
* [valico](https://github.com/rustless/valico)
### Swift ###
* [JSONSchema](https://github.com/kylef/JSONSchema.swift)
If you use it as well, please fork and send a pull request adding yourself to
the list :).

@ -55,6 +55,25 @@
}
]
},
{
"description":
"additionalProperties can exist by itself",
"schema": {
"additionalProperties": {"type": "boolean"}
},
"tests": [
{
"description": "an additional valid property is valid",
"data": {"foo" : true},
"valid": true
},
{
"description": "an additional invalid property is invalid",
"data": {"foo" : 1},
"valid": false
}
]
},
{
"description": "additionalProperties are allowed by default",
"schema": {"properties": {"foo": {}, "bar": {}}},

@ -55,6 +55,25 @@
}
]
},
{
"description":
"additionalProperties can exist by itself",
"schema": {
"additionalProperties": {"type": "boolean"}
},
"tests": [
{
"description": "an additional valid property is valid",
"data": {"foo" : true},
"valid": true
},
{
"description": "an additional invalid property is invalid",
"data": {"foo" : 1},
"valid": false
}
]
},
{
"description": "additionalProperties are allowed by default",
"schema": {"properties": {"foo": {}, "bar": {}}},

@ -19,8 +19,6 @@ from jsonschema.validators import (
Draft3Validator, Draft4Validator, RefResolver, validate
)
__version__ = "2.5.0-dev"
from jsonschema._version import __version__
# flake8: noqa

@ -190,9 +190,20 @@ def enum(validator, enums, instance, schema):
def ref(validator, ref, instance, schema):
with validator.resolver.resolving(ref) as resolved:
for error in validator.descend(instance, resolved):
yield error
resolve = getattr(validator.resolver, "resolve", None)
if resolve is None:
with validator.resolver.resolving(ref) as resolved:
for error in validator.descend(instance, resolved):
yield error
else:
scope, resolved = validator.resolver.resolve(ref)
validator.resolver.push_scope(scope)
try:
for error in validator.descend(instance, resolved):
yield error
finally:
validator.resolver.pop_scope()
def type_draft3(validator, types, instance, schema):
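
The `ref` hunk above keeps older resolvers working by probing for a `resolve` method and falling back to the `resolving` context manager when it is absent. A minimal sketch of that dispatch in isolation follows; `LegacyResolver`, `NewResolver` and `resolve_ref` are hypothetical stand-ins for illustration, not the library's classes.

```python
from contextlib import contextmanager


class LegacyResolver(object):
    """Stand-in for a resolver that only offers the context manager."""

    @contextmanager
    def resolving(self, ref):
        yield {"type": "integer"}


class NewResolver(object):
    """Stand-in for a resolver with resolve/push_scope/pop_scope."""

    def __init__(self):
        self.scopes = [""]

    def resolve(self, ref):
        return ref, {"type": "integer"}

    def push_scope(self, scope):
        self.scopes.append(scope)

    def pop_scope(self):
        self.scopes.pop()


def resolve_ref(resolver, ref):
    """Yield the resolved subschema, whichever API the resolver has."""
    resolve = getattr(resolver, "resolve", None)
    if resolve is None:
        # Legacy path: the context manager handles scope itself.
        with resolver.resolving(ref) as resolved:
            yield resolved
    else:
        scope, resolved = resolve(ref)
        resolver.push_scope(scope)
        try:
            yield resolved
        finally:
            # Always restore the scope, even if the consumer raises.
            resolver.pop_scope()


legacy_result = list(resolve_ref(LegacyResolver(), "the ref"))
new_resolver = NewResolver()
new_result = list(resolve_ref(new_resolver, "the ref"))
```

The `try`/`finally` matters because the function is a generator: the scope pushed before the first `yield` has to be popped whether the caller exhausts the generator or abandons it after an error.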

jsonschema/_version.py Normal file

@ -0,0 +1,5 @@
# This file is automatically generated by setup.py.
__version__ = '2.3.0.post133'
__sha__ = 'g8ebd5bc'
__revision__ = 'g8ebd5bc'

@ -1,6 +1,6 @@
from __future__ import unicode_literals
import sys
import operator
import sys
try:
from collections import MutableMapping, Sequence # noqa
@ -8,9 +8,11 @@ except ImportError:
from collections.abc import MutableMapping, Sequence # noqa
PY3 = sys.version_info[0] >= 3
PY26 = sys.version_info[:2] == (2, 6)
if PY3:
zip = zip
from functools import lru_cache
from io import StringIO
from urllib.parse import (
unquote, urljoin, urlunsplit, SplitResult, urlsplit as _urlsplit
@ -31,6 +33,11 @@ else:
int_types = int, long
iteritems = operator.methodcaller("iteritems")
if PY26:
from repoze.lru import lru_cache
else:
from functools32 import lru_cache
# On python < 3.3 fragments are not handled properly with unknown schemes
def urlsplit(url):
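
The compat changes above only select which `lru_cache` implementation gets imported. As a rough illustration of why the cache helps, here is a sketch using the Python 3 standard library alone; `cached_urljoin` and `join_many` are hypothetical names, and the example URL is made up.

```python
from functools import lru_cache
from urllib.parse import urljoin

# Wrap urljoin in a bounded LRU cache, in the same shape as the
# lru_cache(1024)(urljoin) call introduced in this commit.
cached_urljoin = lru_cache(1024)(urljoin)


def join_many(scope, refs):
    """Join every ref against the same scope, reusing cached results."""
    return [cached_urljoin(scope, ref) for ref in refs]


# Validating many instances against one schema joins the same
# (scope, ref) pairs over and over; only the first join does real work.
urls = join_many("http://example.com/schema.json", ["#/definitions/a"] * 3)
info = cached_urljoin.cache_info()
print(urls[0], info.hits, info.misses)
```

Because `lru_cache` keys on the call arguments, this only works when they are hashable, which holds for the string URLs used here.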

@ -633,17 +633,32 @@ class ValidatorTestMixin(object):
resolver = RefResolver("", {})
schema = {"$ref" : mock.Mock()}
@contextmanager
def resolving():
yield {"type": "integer"}
with mock.patch.object(resolver, "resolving") as resolve:
resolve.return_value = resolving()
with mock.patch.object(resolver, "resolve") as resolve:
resolve.return_value = "url", {"type": "integer"}
with self.assertRaises(ValidationError):
self.validator_class(schema, resolver=resolver).validate(None)
resolve.assert_called_once_with(schema["$ref"])
def test_it_delegates_to_a_legacy_ref_resolver(self):
"""
Legacy RefResolvers support only the context manager form of
resolution.
"""
class LegacyRefResolver(object):
@contextmanager
def resolving(this, ref):
self.assertEqual(ref, "the ref")
yield {"type" : "integer"}
resolver = LegacyRefResolver()
schema = {"$ref" : "the ref"}
with self.assertRaises(ValidationError):
self.validator_class(schema, resolver=resolver).validate(None)
def test_is_type_is_true_for_valid_type(self):
self.assertTrue(self.validator.is_type("foo", "string"))
@ -775,11 +790,11 @@ class TestRefResolver(unittest.TestCase):
self.assertEqual(resolved, self.referrer["properties"]["foo"])
def test_it_resolves_local_refs_with_id(self):
schema = {"id": "foo://bar/schema#", "a": {"foo": "bar"}}
schema = {"id": "http://bar/schema#", "a": {"foo": "bar"}}
resolver = RefResolver.from_schema(schema)
with resolver.resolving("#/a") as resolved:
self.assertEqual(resolved, schema["a"])
with resolver.resolving("foo://bar/schema#/a") as resolved:
with resolver.resolving("http://bar/schema#/a") as resolved:
self.assertEqual(resolved, schema["a"])
def test_it_retrieves_stored_refs(self):
@ -816,6 +831,7 @@ class TestRefResolver(unittest.TestCase):
schema = {"id" : "foo"}
resolver = RefResolver.from_schema(schema)
self.assertEqual(resolver.base_uri, "foo")
self.assertEqual(resolver.resolution_scope, "foo")
with resolver.resolving("") as resolved:
self.assertEqual(resolved, schema)
with resolver.resolving("#") as resolved:
@ -829,6 +845,7 @@ class TestRefResolver(unittest.TestCase):
schema = {}
resolver = RefResolver.from_schema(schema)
self.assertEqual(resolver.base_uri, "")
self.assertEqual(resolver.resolution_scope, "")
with resolver.resolving("") as resolved:
self.assertEqual(resolved, schema)
with resolver.resolving("#") as resolved:
@ -863,9 +880,7 @@ class TestRefResolver(unittest.TestCase):
)
with resolver.resolving(ref):
pass
with resolver.resolving(ref):
pass
self.assertEqual(foo_handler.call_count, 2)
self.assertEqual(foo_handler.call_count, 1)
def test_if_you_give_it_junk_you_get_a_resolution_error(self):
ref = "foo://bar"
@ -876,6 +891,13 @@ class TestRefResolver(unittest.TestCase):
pass
self.assertEqual(str(err.exception), "Oh no! What's this?")
def test_helpful_error_message_on_failed_pop_scope(self):
resolver = RefResolver("", {})
resolver.pop_scope()
with self.assertRaises(RefResolutionError) as exc:
resolver.pop_scope()
self.assertIn("Failed to pop the scope", str(exc.exception))
def sorted_errors(errors):
def key(error):

@ -12,7 +12,7 @@ except ImportError:
from jsonschema import _utils, _validators
from jsonschema.compat import (
Sequence, urljoin, urlsplit, urldefrag, unquote, urlopen,
str_types, int_types, iteritems,
str_types, int_types, iteritems, lru_cache,
)
from jsonschema.exceptions import ErrorTree # Backwards compatibility # noqa
from jsonschema.exceptions import RefResolutionError, SchemaError, UnknownType
@ -79,7 +79,10 @@ def create(meta_schema, validators=(), version=None, default_types=None): # noq
if _schema is None:
_schema = self.schema
with self.resolver.in_scope(_schema.get(u"id", u"")):
scope = _schema.get(u"id")
if scope:
self.resolver.push_scope(scope)
try:
ref = _schema.get(u"$ref")
if ref is not None:
validators = [(u"$ref", ref)]
@ -103,6 +106,9 @@ def create(meta_schema, validators=(), version=None, default_types=None): # noq
if k != u"$ref":
error.schema_path.appendleft(k)
yield error
finally:
if scope:
self.resolver.pop_scope()
def descend(self, instance, schema, path=None, schema_path=None):
for error in self.iter_errors(instance, schema):
@ -227,19 +233,33 @@ class RefResolver(object):
first resolution
:argument dict handlers: a mapping from URI schemes to functions that
should be used to retrieve them
:argument functools.lru_cache urljoin_cache: a cache that will be used for
caching the results of joining the resolution scope to subscopes.
:argument functools.lru_cache remote_cache: a cache that will be used for
caching the results of resolved remote URLs.
"""
def __init__(
self, base_uri, referrer, store=(), cache_remote=True, handlers=(),
self,
base_uri,
referrer,
store=(),
cache_remote=True,
handlers=(),
urljoin_cache=None,
remote_cache=None,
):
self.base_uri = base_uri
self.resolution_scope = base_uri
# This attribute is not used, it is for backwards compatibility
if urljoin_cache is None:
urljoin_cache = lru_cache(1024)(urljoin)
if remote_cache is None:
remote_cache = lru_cache(1024)(self.resolve_from_url)
self.referrer = referrer
self.cache_remote = cache_remote
self.handlers = dict(handlers)
self._scopes_stack = [base_uri]
self.store = _utils.URIDict(
(id, validator.META_SCHEMA)
for id, validator in iteritems(meta_schemas)
@ -247,26 +267,52 @@ class RefResolver(object):
self.store.update(store)
self.store[base_uri] = referrer
self._urljoin_cache = urljoin_cache
self._remote_cache = remote_cache
@classmethod
def from_schema(cls, schema, *args, **kwargs):
"""
Construct a resolver from a JSON schema object.
:argument schema schema: the referring schema
:argument schema: the referring schema
:rtype: :class:`RefResolver`
"""
return cls(schema.get(u"id", u""), schema, *args, **kwargs)
def push_scope(self, scope):
self._scopes_stack.append(
self._urljoin_cache(self.resolution_scope, scope),
)
def pop_scope(self):
try:
self._scopes_stack.pop()
except IndexError:
raise RefResolutionError(
"Failed to pop the scope from an empty stack. "
"`pop_scope()` should only be called once for every "
"`push_scope()`",
)
@property
def resolution_scope(self):
return self._scopes_stack[-1]
@property
def base_uri(self):
uri, _ = urldefrag(self.resolution_scope)
return uri
@contextlib.contextmanager
def in_scope(self, scope):
old_scope = self.resolution_scope
self.resolution_scope = urljoin(old_scope, scope)
self.push_scope(scope)
try:
yield
finally:
self.resolution_scope = old_scope
self.pop_scope()
@contextlib.contextmanager
def resolving(self, ref):
@ -278,25 +324,28 @@ class RefResolver(object):
"""
full_uri = urljoin(self.resolution_scope, ref)
uri, fragment = urldefrag(full_uri)
if not uri:
uri = self.base_uri
url, resolved = self.resolve(ref)
self.push_scope(url)
try:
yield resolved
finally:
self.pop_scope()
if uri in self.store:
document = self.store[uri]
else:
def resolve(self, ref):
url = self._urljoin_cache(self.resolution_scope, ref)
return url, self._remote_cache(url)
def resolve_from_url(self, url):
url, fragment = urldefrag(url)
try:
document = self.store[url]
except KeyError:
try:
document = self.resolve_remote(uri)
document = self.resolve_remote(url)
except Exception as exc:
raise RefResolutionError(exc)
old_base_uri, self.base_uri = self.base_uri, uri
try:
with self.in_scope(uri):
yield self.resolve_fragment(document, fragment)
finally:
self.base_uri = old_base_uri
return self.resolve_fragment(document, fragment)
def resolve_fragment(self, document, fragment):
"""
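
The `RefResolver` changes above replace a single mutable `resolution_scope` attribute with a stack, and derive `base_uri` from the top of that stack. A minimal sketch of the idea on its own follows, assuming Python 3; `ScopeStack` is a hypothetical name, the URLs are made up, and the real class also handles the store, remote fetching and error reporting.

```python
from functools import lru_cache
from urllib.parse import urldefrag, urljoin


class ScopeStack(object):
    """Track nested resolution scopes as a stack of absolute URLs."""

    def __init__(self, base_uri):
        self._scopes = [base_uri]
        # Each instance gets its own cache of joined URLs.
        self._join = lru_cache(1024)(urljoin)

    @property
    def resolution_scope(self):
        return self._scopes[-1]

    @property
    def base_uri(self):
        # The base URI is the current scope with its fragment removed.
        uri, _ = urldefrag(self.resolution_scope)
        return uri

    def push_scope(self, scope):
        # New scopes are resolved relative to the current one.
        self._scopes.append(self._join(self.resolution_scope, scope))

    def pop_scope(self):
        self._scopes.pop()


stack = ScopeStack("http://example.com/root.json")
stack.push_scope("defs.json#/a")
inner_scope = stack.resolution_scope
inner_base = stack.base_uri
stack.pop_scope()
outer_scope = stack.resolution_scope
```

Keeping the fragment in the stored scope and stripping it only when `base_uri` is read means entering or leaving a scope is one list operation, with no separate attribute to save and restore.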

@ -1,9 +1,10 @@
import os
import sys
from setuptools import setup
from jsonschema import __version__
with open("README.rst") as readme:
with open(os.path.join(os.path.dirname(__file__), "README.rst")) as readme:
long_description = readme.read()
classifiers = [
@ -21,11 +22,22 @@ classifiers = [
"Programming Language :: Python :: Implementation :: PyPy",
]
extras_require = {"format" : ["rfc3987", "strict-rfc3339", "webcolors"]}
if sys.version_info[:2] == (2, 6):
install_requires = ["argparse", "repoze.lru"]
elif sys.version_info[:2] == (2, 7):
install_requires = ["functools32"]
else:
install_requires = []
setup(
name="jsonschema",
version=__version__,
packages=["jsonschema", "jsonschema.tests"],
package_data={"jsonschema": ["schemas/*.json"]},
setup_requires=["vcversioner"],
install_requires=install_requires,
extras_require=extras_require,
author="Julian Berman",
author_email="Julian@GrayVines.com",
classifiers=classifiers,
@ -34,4 +46,5 @@ setup(
long_description=long_description,
url="http://github.com/Julian/jsonschema",
entry_points={"console_scripts": ["jsonschema = jsonschema.cli:main"]},
vcversioner={"version_module_paths" : ["jsonschema/_version.py"]},
)

tox.ini

@ -3,20 +3,21 @@ envlist = py{26,27,34,py,py3}, docs, style
[testenv]
# by default tox runs with --pre which tickles this bug:
# https://bitbucket.org/pypy/pypy/issue/1894/keyerror-core-dumped-on-unicode-triple
install_command = pip install {opts} {packages}
changedir = {envtmpdir}
setenv =
JSON_SCHEMA_TEST_SUITE = {toxinidir}/json
commands =
py.test [] jsonschema
{envpython} -m doctest README.rst
py{26,27,34,py}: sphinx-build -b doctest docs {envtmpdir}/html
deps =
pytest
strict-rfc3339
webcolors
py{27,34,py,py3}: rfc3987
py26: trial [] jsonschema
py{27,34,py,py3}: green [] jsonschema
{envpython} -m doctest {toxinidir}/README.rst
py{26,27,34}: sphinx-build -b doctest {toxinidir}/docs {envtmpdir}/html
deps =
-e{toxinidir}[format]
py26: twisted
py{27,34,py,py3}: green
py26: argparse
py26: unittest2
py{26,27,py,py3}: mock
@ -27,7 +28,7 @@ deps =
[testenv:coverage]
commands =
coverage run --branch --source jsonschema [] {envbindir}/py.test jsonschema
coverage run --branch --source {toxinidir}/jsonschema [] {envbindir}/green jsonschema
coverage report --show-missing
coverage html
deps =
@ -46,12 +47,8 @@ commands =
[testenv:style]
deps = flake8
commands =
flake8 [] --max-complexity 10 jsonschema
flake8 [] --max-complexity 10 {toxinidir}/jsonschema
[flake8]
ignore = E203,E302,E303,E701,F811
[pytest]
addopts = -r s -s

version.txt Normal file

@ -0,0 +1 @@
v2.3.0-133-g8ebd5bc