A Rust/PyO3 accelerator for cerberus — an iso-functional,
drop-in Validator whose validate() hot path runs in native code, with transparent fallback to
cerberus for anything outside its fast path. It's a separate package that depends on cerberus
and delegates back to it, so it never diverges on input cerberus accepts — by construction.

    pip install pulse-cerberus

    # was: from cerberus import Validator
    from pulse_cerberus import Validator

    schema = {
        "id": {"type": "integer", "required": True, "min": 1},
        "name": {"type": "string", "required": True, "minlength": 1, "maxlength": 64},
        "role": {"type": "string", "allowed": ["admin", "user", "guest"]},
        "tags": {"type": "list", "schema": {"type": "string"}},
        "addr": {"type": "dict", "schema": {"city": {"type": "string"}}},
    }
    v = Validator(schema)  # compiled once
    v.validate({"id": 1, "name": "alice", "role": "admin"})  # → True, ~200× faster
    v.errors  # same .errors tree as cerberus, byte-for-byte

The only change is the import. Validator, .validate(), .validated(), .errors, .document,
SchemaError, DocumentError, registries: the cerberus API works unchanged.

cerberus's Validator.validate is interpreted Python: per call it re-dispatches every rule, spawns
child-validators for nested schemas, and re-expands the schema. pulse-cerberus compiles the schema
once into a Rust rule-AST and then validates flatly.
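
As a rough illustration of the compile-once idea, the sketch below turns a schema into flat lists of predicates up front, so that each validation call only runs prebuilt checks. It is written in Python for readability, and every name in it (`compile_field`, `compile_schema`, `validate_flat`) is hypothetical; the real rule-AST lives in Rust and covers far more rules, plus error reporting.

```python
from typing import Any, Callable

# Hypothetical names throughout; this is not pulse-cerberus's internal API.
Check = Callable[[Any], bool]


def compile_field(rules: dict) -> list[Check]:
    """Turn one field's rule dict into a flat list of predicates (done once)."""
    checks: list[Check] = []
    # The type check goes first so all() short-circuits before min/maxlength
    # ever see a wrong-typed value. Assumes "type" accompanies those rules.
    if rules.get("type") == "integer":
        checks.append(lambda v: isinstance(v, int))
    elif rules.get("type") == "string":
        checks.append(lambda v: isinstance(v, str))
    if "min" in rules:
        lo = rules["min"]
        checks.append(lambda v: v >= lo)
    if "maxlength" in rules:
        limit = rules["maxlength"]
        checks.append(lambda v: len(v) <= limit)
    return checks


def compile_schema(schema: dict) -> dict[str, tuple[bool, list[Check]]]:
    """Precompute (required, checks) per field; no schema work is left per call."""
    return {
        name: (rules.get("required", False), compile_field(rules))
        for name, rules in schema.items()
    }


def validate_flat(compiled: dict, doc: dict) -> bool:
    """Run the prebuilt checks; no rule dispatch or schema expansion here."""
    for name, (required, checks) in compiled.items():
        if name not in doc:
            if required:
                return False
            continue
        value = doc[name]
        if not all(check(value) for check in checks):
            return False
    return True


compiled = compile_schema({
    "id": {"type": "integer", "required": True, "min": 1},
    "name": {"type": "string", "maxlength": 4},
})
print(validate_flat(compiled, {"id": 1, "name": "bob"}))
```

The point of the design is that everything depending only on the schema happens in `compile_schema`, leaving the per-document loop as small as possible.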
Drift-immune A/B (median per call), pre-built validator, realistic schema (8 fields incl. a nested dict and a list), CPython 3.11, Apple Silicon:
| | median / validate() | speedup |
|---|---|---|
| cerberus.Validator.validate | ~245 µs | baseline |
| pulse_cerberus.Validator.validate | ~1.2 µs | ~×200 |
The striking part: cerberus pays a large fixed per-call cost (rule dispatch + child-validator spawning
+ schema re-expansion) on every validate(), even for a small document. That interpreted machinery is
exactly what a native validator removes. Reproduce it:

    import statistics
    import time

    import cerberus
    import pulse_cerberus

    schema = {"id": {"type": "integer", "required": True, "min": 1},
              "name": {"type": "string", "required": True, "minlength": 1, "maxlength": 64},
              "role": {"type": "string", "allowed": ["admin", "user", "guest"]},
              "tags": {"type": "list", "schema": {"type": "string"}},
              "addr": {"type": "dict", "schema": {"city": {"type": "string"}}}}
    doc = {"id": 1, "name": "alice", "role": "admin", "tags": ["x"], "addr": {"city": "Paris"}}

    ref = cerberus.Validator(schema)  # build once (validators are meant to be reused)
    cand = pulse_cerberus.Validator(schema)

    def bench(v, reps, rounds=15):
        out = []
        for _ in range(rounds):
            t = time.perf_counter()
            for _ in range(reps):
                v.validate(doc)
            out.append((time.perf_counter() - t) / reps)
        return statistics.median(out)

    r = bench(ref, 2000)
    c = bench(cand, 20000)
    print(f"cerberus {r*1e6:.1f} us -> pulse {c*1e6:.2f} us (x{r/c:.0f})")

Profiling validate(): the hot path is interpreted Python (__validate_definitions /
__get_rule_handler per rule×field, child-validator spawning, schema re-expansion), 0 % in the C
re engine. The ~38 % that shows up as "C" is isinstance / abc.__instancecheck__ / dict.get —
dispatch glue that simply evaporates in typed Rust. Because the bottleneck is interpreted Python (not a
C-bound kernel), a native rewrite wins by orders of magnitude.
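
A profile of this kind can be collected with the standard library's `cProfile`. The helper below is a minimal sketch (the name `profile_calls` is hypothetical, not part of either package): it runs a callable repeatedly and returns the top entries sorted by time spent inside each function. The stand-in workload only keeps the block self-contained.

```python
import cProfile
import io
import pstats


def profile_calls(fn, reps=1000, top=10):
    """Run fn() reps times under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(reps):
        fn()
    profiler.disable()
    buf = io.StringIO()
    # "tottime" sorts by time spent in the function itself, excluding callees.
    pstats.Stats(profiler, stream=buf).sort_stats("tottime").print_stats(top)
    return buf.getvalue()


# Stand-in workload so the example runs without cerberus installed.
report = profile_calls(lambda: sorted(range(100)), reps=100)
print(report)
```

With the benchmark script's `ref` and `doc` in scope, `profile_calls(lambda: ref.validate(doc))` would profile the cerberus path. Exact percentages will vary by machine and cerberus version.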
pulse-cerberus validates natively when the schema uses only:
- types: integer / float / number / boolean / string / dict / list (with cerberus's exact bool semantics: integer and float accept True, number excludes it);
- rules: required, allowed, min, max, minlength, maxlength, empty, nullable, and nested schema (dict + list-of), with allow_unknown and require_all.
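
The bool semantics mentioned in the list come from the fact that `bool` is a subclass of `int` in Python. The sketch below spells out the three numeric checks as plain `isinstance` tests. The helper names are hypothetical, and the assumption that float also accepts plain integers is inferred from the statement that it accepts True; this is not code copied from either library.

```python
def is_integer(value) -> bool:
    # bool subclasses int, so True and False pass.
    return isinstance(value, int)


def is_float(value) -> bool:
    # Assumed to accept ints as well as floats, hence True passes too.
    return isinstance(value, (float, int))


def is_number(value) -> bool:
    # Same numeric types, but bool is explicitly excluded.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


print(is_integer(True), is_float(True), is_number(True))  # True True False
```

A native implementation has to reproduce exactly these cases, because a stricter integer check that rejected True would diverge from cerberus.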
Everything else is transparently delegated to cerberus: normalization (coerce/default/rename/
purge_unknown/readonly), logic (*of, dependencies, excludes), check_with, keysrules/
valuesrules, items, contains, regex, allow_unknown as a rules-set, registries, custom
Validator subclasses, non-dict documents, and exotic value types. A SchemaError is raised at
construction exactly as cerberus would. When in doubt, it falls back — it never guesses.
The error messages are rendered through Python's str() (cerberus formats its own messages the same way),
so the .errors tree (structure, messages, and per-field alphabetical-by-rule order) is identical to cerberus's.
Proven by a typed differential oracle comparing (validate(), .errors, .document) against stock
cerberus, on a curated corpus of the iso-critical cases plus adversarial fuzzing of random
(schema, doc) pairs — including the bool subtleties, error ordering, nested/list-of error trees, the
empty/min interaction, and exception parity (SchemaError/DocumentError). The pure-Python
fallback path is verified to be iso too (PULSE_FORCE_FALLBACK=1).
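
A differential oracle of this kind can be sketched in a few lines. The version below is an illustration only, not the project's test harness: `typed`, `observe`, `find_divergence` and both stub validators are hypothetical, and reading "typed" as "compare value types as well as values" is an assumption. That reading matters because `True == 1` in Python, so a plain `==` comparison would hide exactly the bool subtleties mentioned above.

```python
import random


def typed(value):
    """Tag every value with its type so True and 1 no longer compare equal."""
    if isinstance(value, dict):
        items = sorted(value.items(), key=lambda kv: repr(kv[0]))
        return ("dict", tuple((typed(k), typed(v)) for k, v in items))
    if isinstance(value, (list, tuple)):
        return (type(value).__name__, tuple(typed(v) for v in value))
    return (type(value).__name__, value)


def observe(validator, doc):
    """Capture result, errors and document, or the exception type if it raises."""
    try:
        ok = validator.validate(doc)
    except Exception as exc:
        return ("raised", type(exc).__name__)
    return ("returned", typed((ok, validator.errors, validator.document)))


def find_divergence(reference, candidate, docs):
    """Return the first document on which the two validators disagree, if any."""
    for doc in docs:
        if observe(reference, doc) != observe(candidate, doc):
            return doc
    return None


class ReferenceStub:
    """Stand-in for the reference validator: 'id' must be an int (bool allowed)."""
    def validate(self, doc):
        self.document = doc
        ok = isinstance(doc.get("id"), int)
        self.errors = {} if ok else {"id": ["must be of integer type"]}
        return ok


class BuggyStub:
    """Stand-in candidate with a deliberate bug: it rejects True."""
    def validate(self, doc):
        self.document = doc
        value = doc.get("id")
        ok = isinstance(value, int) and not isinstance(value, bool)
        self.errors = {} if ok else {"id": ["must be of integer type"]}
        return ok


# Seeded random documents, standing in for the fuzzing step.
rng = random.Random(0)
docs = [{"id": rng.choice([1, "x", None, 2.5, True])} for _ in range(200)]
print(find_divergence(ReferenceStub(), BuggyStub(), docs))
```

In the real setup the two sides would be stock cerberus and pulse-cerberus, with the same comparison repeated under PULSE_FORCE_FALLBACK=1 to cover the fallback path.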
abi3 wheels (Python ≥ 3.11) for Linux (x86_64/aarch64, manylinux + musllinux), macOS (Apple Silicon),
and Windows; sdist elsewhere (builds the Rust core via maturin).
License: ISC (same as cerberus).