A Rust/PyO3 accelerator for jsonpath-ng — a native parser
that is iso-functional with jsonpath_ng and ~500× faster at parsing, with transparent
fallback to jsonpath-ng for anything outside its fast path. It's a separate package that depends
on jsonpath-ng: it rebuilds jsonpath-ng's own AST and reuses its evaluator, so it never diverges — by
construction.
pip install pulse-jsonpath-ng# was: from jsonpath_ng import parse
from pulse_jsonpath_ng import parse
expr = parse("items[*].price") # parsed in Rust, ~500× faster
[m.value for m in expr.find(data)] # .find() is jsonpath_ng's own — identical resultsThe only change is the import. parse() returns a real jsonpath_ng expression (same classes), so
.find(), .full_path, __eq__, and the exceptions all work unchanged.
jsonpath_ng.parse is the hot spot: it's built on PLY with write_tables=0, so it rebuilds the LALR
parse table (and the lexer table) on every single call — ~2.5 ms to parse a 3-segment expression,
roughly 300× the cost of the find it precedes. pulse-jsonpath-ng replaces only the parser with a
hand-written recursive-descent parser in Rust.
median / parse() |
speedup | |
|---|---|---|
jsonpath_ng.parse |
~2.5 ms | — |
pulse_jsonpath_ng.parse |
~5 µs | ~×500 |
import statistics, time, jsonpath_ng, pulse_jsonpath_ng
EXPR = "items[*].price"
def bench(parse, reps, rounds=15):
out = []
for _ in range(rounds):
t = time.perf_counter()
for _ in range(reps): parse(EXPR)
out.append((time.perf_counter() - t) / reps)
return statistics.median(out)
r = bench(jsonpath_ng.parse, 200); c = bench(pulse_jsonpath_ng.parse, 20000)
print(f"jsonpath_ng {r*1e6:.0f} us -> pulse {c*1e6:.2f} us (x{r/c:.0f})")If you already cache compiled expressions you won't see this — but a great deal of code calls
jsonpath_ng.parse(...) inline on every request, and that is exactly where the per-call table rebuild
hurts.
Profiling a parse(): ~100 % of the time is interpreted Python — PLY rebuilding lr_parse_table and the
lexer table — with re accounting for ~0.2 % (the lexer regex, built once). Because the bottleneck is
interpreted table-construction (not a C-bound kernel), a native parser wins by orders of magnitude. We
port only the parser; find stays in jsonpath_ng (reused, so evaluation is iso for free).
The Rust fast-path covers the common ./[...] expressions, left-associative:
- fields
a.b.c, wildcard*/a.*, numeric fieldsa.5, root$,`this`; - brackets
[0],[-1],[0,2], slices[1:3]/[::2]/[*], field brackets['a','b']; - quoted fields
'a.b', dotted-bracket equivalencea.[0]≡a[0].
Everything else is transparently delegated to jsonpath_ng: descendants .., |/& (union/
intersect), where/wherenot, parent, parentheses, named operators other than this, unicode
identifiers, and any invalid or ambiguous input (jsonpath_ng then returns the correct tree or raises the
correct JsonPathParserError/JsonPathLexerError). The fast-path is a strict subset — it never
accepts something it can't reproduce exactly.
The Rust parser rebuilds the same jsonpath_ng AST classes, so iso is verifiable two ways: the parsed
tree is structurally == to jsonpath_ng.parse(expr), and .find() (jsonpath_ng's own) returns
identical values and paths. Proven by a differential oracle over a curated corpus plus adversarial
fuzzing of random expressions — tree equality, .find() parity, and exception-type parity for invalid
input. The pure-jsonpath_ng fallback path is verified iso too (PULSE_FORCE_FALLBACK=1).
abi3 wheels (Python ≥ 3.11) for Linux (x86_64/aarch64, manylinux + musllinux), macOS (Apple Silicon),
and Windows; sdist elsewhere (builds the Rust core via maturin).
Apache-2.0 (same as jsonpath-ng).