Jaccs - JSON Access Strings

Jaccs lets you pull data from JSON objects using strings that contain Python expressions. For example:

>>> from jaccs import access
>>> access({'a': {'b': [{'c': 1}, {'c': 2}]}}, '_.a.b[1].c')
2

This is expected to be most useful for combining specifications stored in external text files (say JSON or YAML) with data objects in Python.

Installation

The best way to install Jaccs is via PyPI:

pip install jaccs

To get the latest development version, install from the GitHub master branch:

pip install https://github.com/jiffyclub/jaccs/archive/master.zip

Source code is on GitHub at https://github.com/jiffyclub/jaccs.

Access String Syntax

Access strings must be valid Python expressions, and the result returned by access() will be the result of calling eval on the expression. The JSON object is represented by an underscore in the expression. Jaccs wraps the JSON object so that dictionary keys can be accessed via attributes:

>>> access({'a': {'b': 'z'}}, '_.a.b')
'z'

List items can be accessed via indexing:

>>> access(['x', 'y', 'z'], '_[0]')
'x'

Not all dictionary keys are valid attribute names, so dictionary values can also be accessed with the usual brackets:

>>> access({'a b c': 'x y z'}, '_["a b c"]')
'x y z'

Expressions need not be limited to data access: they can call Python’s built-in functions and build arbitrarily complex objects of their own. If for some reason you need the raw object behind the wrapper, use the ._ attribute:

>>> access({'data': [1, 2, 3]}, '{"data": _.data._, "sum": sum(_.data._)}')
{'data': [1, 2, 3], 'sum': 6}
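The evaluation model described in this section can be approximated with the standard library alone. What follows is a minimal sketch, not the Jaccs implementation: sketch_eval is a hypothetical helper that binds the object to the name _ and calls eval. It skips the attribute wrapper, so the expressions here use bracket access only.

```python
def sketch_eval(obj, expr):
    """Evaluate expr with obj bound to the name `_` (hypothetical helper)."""
    # The globals dict is empty, so Python supplies its builtins (sum, len, ...).
    # The JSON object is the only local name visible to the expression.
    return eval(expr, {}, {'_': obj})


data = {'data': [1, 2, 3]}
print(sketch_eval(data, '_["data"][1]'))  # 2
print(sketch_eval(data, '{"data": _["data"], "sum": sum(_["data"])}'))
# {'data': [1, 2, 3], 'sum': 6}
```

Because the expression goes through eval, it can run any Python code, so access strings should come from sources you trust.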

API

access

access is the simplest way to combine a JSON object and a string specification for data access. The use_default and default parameters can be used to return a default value when a key or array index is not found within the JSON object. Unless use_default is True, a KeyError or IndexError is raised for missing keys or indices.

jaccs.access(dict_or_list, expr, use_default=False, default=None)

Retrieve whatever is at some path into a JSON object.

Parameters:
  • dict_or_list (sequence or mapping) – The object from which to retrieve data.
  • expr (str) – Access string describing path into dict_or_list. Can be any valid Python expression; use _ to represent dict_or_list.
  • use_default (bool, optional) – If True, the value of default will be returned in instances when a KeyError or IndexError occurs while evaluating expr.
  • default (object, optional) – Default value to return when use_default is True and a KeyError or IndexError occurs while evaluating expr.

Examples

>>> access({'a': {'b': [1, 2, 3]}}, '_.a.b[2]')
3
>>> access({'a': {'b': 1}}, '_.a.c', use_default=True, default=2)
2
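The interplay of use_default and default can be sketched independently of Jaccs. In the sketch below, call_with_default is a hypothetical helper (not part of the Jaccs API) that wraps any zero-argument callable. It assumes that only KeyError and IndexError are swapped for the default, as the parameter descriptions state. Letting every other exception propagate is an assumption of the sketch.

```python
def call_with_default(thunk, use_default=False, default=None):
    """Run thunk(); optionally swap KeyError/IndexError for a default."""
    try:
        return thunk()
    except (KeyError, IndexError):
        if use_default:
            return default
        raise  # no default requested: let the original error propagate


data = {'a': {'b': [1, 2, 3]}}
# Missing key, default supplied.
print(call_with_default(lambda: data['a']['c'], use_default=True, default=2))  # 2
# Index out of range, default left at None.
print(call_with_default(lambda: data['a']['b'][5], use_default=True))  # None
```

With use_default left False, default is ignored and the lookup error reaches the caller.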

access_factory

If you intend to use the same access string repeatedly on a sequence of JSON objects it can be more performant to parse and compile the expression ahead of time. The access_factory function does that and returns a new function ready to process JSON objects according to the configuration passed to access_factory.

jaccs.access_factory(expr, use_default=False, default=None)

Create a function that will take a JSON object and retrieve data according to the arguments here.

Parameters:
  • expr (str) – Access string describing path into a JSON object. Can be any valid Python expression; use _ to represent the JSON object.
  • use_default (bool, optional) – If True, the value of default will be returned in instances when a KeyError or IndexError occurs while evaluating expr.
  • default (object, optional) – Default value to return when use_default is True and a KeyError or IndexError occurs while evaluating expr.
Returns:

function – Takes a single JSON object as its argument and returns data from that object according to the arguments given here.

Examples

>>> access_ = access_factory('_.a.b[2]')
>>> access_({'a': {'b': [1, 2, 3]}})
3
>>> access_ = access_factory('_.a.c', use_default=True, default=2)
>>> access_({'a': {'b': 1}})
2
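The benefit of compiling ahead of time can be illustrated with Python's built-in compile. This is a sketch under assumptions, not the Jaccs source: sketch_factory is a hypothetical name, and the sketch omits the attribute wrapper, so its expressions use bracket access.

```python
def sketch_factory(expr, use_default=False, default=None):
    """Compile expr once and return a function that evaluates it per object."""
    # Parsing and compiling happen a single time, here.
    code = compile(expr, '<access string>', 'eval')

    def accessor(obj):
        try:
            # eval accepts a code object, so no re-parsing per call.
            return eval(code, {}, {'_': obj})
        except (KeyError, IndexError):
            if use_default:
                return default
            raise

    return accessor


get_third = sketch_factory('_["a"]["b"][2]')
objects = [{'a': {'b': [1, 2, 3]}}, {'a': {'b': [4, 5, 6]}}]
print([get_third(obj) for obj in objects])  # [3, 6]
```

The closure keeps the compiled code object and the default settings together, so the returned function needs only the JSON object.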

spec_to_records

spec_to_records allows you to stream a sequence of JSON objects against a dictionary of access strings (a spec), yielding dictionaries with the same keys as the spec holding values pulled from the JSON objects.

jaccs.spec_to_records(spec, seq_of_json)

Combine a specification of JSON access strings with a sequence of JSON objects to create a sequence of records.

Parameters:
  • spec (dict) – Dictionary in which keys are paired with dictionaries or strings specifying how to access an item from the JSON objects in seq_of_json. Dictionaries must have at least an expr key and may optionally have use_default and default keys. See the access documentation for descriptions of the values. If the dict values are strings they are used as the expr.
  • seq_of_json (iterable) – Any iterable of JSON objects.
Yields:

dict – One dictionary for each value in seq_of_json. Each will have the same keys as spec and values pulled from the JSON objects according to the expressions in spec.

Examples

>>> spec = {'b': '_.a.b', 'c': {'expr': '_.a.c', 'use_default': True, 'default': 4}}
>>> json = [{'a': {'b': 1, 'c': 2}}, {'a': {'b': 3}}]
>>> list(spec_to_records(spec, json))
[{'c': 2, 'b': 1}, {'c': 4, 'b': 3}]
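To make the spec format concrete, the sketch below parses a spec from JSON text and streams objects through it. It is an illustration only: sketch_records is a hypothetical helper rather than the Jaccs implementation. It uses bracket access because it has no attribute wrapper, and the normalization step is one assumed way to treat string and dictionary entries uniformly.

```python
import json


def sketch_records(spec, seq_of_json):
    """Yield one dict per JSON object, keyed like spec (hypothetical helper)."""
    # Normalize: a bare string is shorthand for {'expr': <string>}.
    normalized = {
        key: ({'expr': value} if isinstance(value, str) else value)
        for key, value in spec.items()
    }
    for obj in seq_of_json:
        record = {}
        for key, entry in normalized.items():
            try:
                record[key] = eval(entry['expr'], {}, {'_': obj})
            except (KeyError, IndexError):
                if not entry.get('use_default', False):
                    raise
                record[key] = entry.get('default')
        yield record


# A spec as it might arrive from an external JSON file.
spec = json.loads('''
{"b": "_['a']['b']",
 "c": {"expr": "_['a']['c']", "use_default": true, "default": 4}}
''')
objects = [{'a': {'b': 1, 'c': 2}}, {'a': {'b': 3}}]
records = list(sketch_records(spec, objects))
print(records)
```

Because the function is a generator, records are produced one at a time, and seq_of_json can be any iterable, including one that is itself lazy.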

Dots

Dots is a class used internally by Jaccs to translate attribute lookups into normal key-based access on dictionaries. It’s the workhorse used when evaluating JSON access strings. It’s exposed here in case you want to use it directly to write less verbose dictionary access code, without any of the functions documented above.

class jaccs.Dots(data)

Wraps dictionaries or lists to implement attribute access for dict keys.

Scalar-like objects (strings, numbers, etc) are returned as is, but nested sequences/mappings are wrapped again to support further attribute-style access. Use the _ attribute to access the raw wrapped Python object.

Parameters:
  • data (mapping or sequence) – Data object to wrap for access.

Example

>>> d = Dots({'a': {'b': [1, 2, 3]}})
>>> d.a.b[1]
2
>>> d.a
Dots({'b': [1, 2, 3]})
>>> d.a._
{'b': [1, 2, 3]}
_

Access the wrapped Python object.
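The wrapping behavior described above can be sketched in a few lines. DotsSketch below is a hypothetical stand-in written for illustration and is not the Jaccs source. It assumes only dict and list values need re-wrapping and that a missing key surfaces as a KeyError.

```python
class DotsSketch:
    """Hypothetical stand-in for a Dots-like wrapper; not the Jaccs source."""

    def __init__(self, data):
        self._ = data  # raw wrapped object, exposed as `._`

    @staticmethod
    def _wrap(value):
        # Containers are wrapped again; scalars are returned as is.
        if isinstance(value, (dict, list)):
            return DotsSketch(value)
        return value

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so `._` still works.
        return self._wrap(self._[name])

    def __getitem__(self, key):
        return self._wrap(self._[key])

    def __repr__(self):
        return 'DotsSketch({!r})'.format(self._)


d = DotsSketch({'a': {'b': [1, 2, 3]}})
print(d.a.b[1])  # 2
print(d.a)       # DotsSketch({'b': [1, 2, 3]})
print(d.a._)     # {'b': [1, 2, 3]}
```

One consequence of this design is that a key named _ cannot be reached through attribute access, since that name is taken by the raw object. Bracket access still reaches it.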