feat(stark): single-source constraints — one definition per constraint, verified cross-version [stacked on #763] by MauroToscano · Pull Request #764 · yetanotherco/lambda_vm

MauroToscano · 2026-07-02T12:54:47Z

The proof, first

A cli built from before this change — the old boxed-constraint machinery — verifies proofs produced by the prover after this change, and vice versa, for real VM executions:

scripts/cross_verify_vm.sh 2499b2a8 e76807ee
PASS prove-NEW-verify-OLD : sub      PASS prove-OLD-verify-NEW : sub
PASS prove-NEW-verify-OLD : add      PASS prove-OLD-verify-NEW : add
PASS prove-NEW-verify-OLD : arith_8  PASS prove-OLD-verify-NEW : arith_8

Because the verifier recomputes the OOD constraint evaluations from its own constraint definitions against the other side's commitments, this passing in both directions means the constraint system is semantically unchanged: same polynomials, same ordering, same degrees, same transcript. The same exchange passes for the 11 example AIRs (22/22, scripts/cross_verify_examples.sh). Evidence logs live on the frozen branch reference/sscs-cross-verify-evidence (run at three checkpoints: pre-deletion, post-deletion, final tip). Reproduce any of it with the two scripts — they stay in-tree as permanent regression gates.

What this PR is

Every transition constraint is now defined once — one eval body per table — and interpreted three ways: compiled prover evaluation per LDE row, compiled verifier evaluation at the OOD point (the identical code path the recursion guest runs: no hashing, no interpretation in-circuit), and a captured flat IR (air.constraint_program()) for the upcoming GPU constraint interpreter.

Net −6,586 lines. Deleted: the TransitionConstraintEvaluator/adapter/boxed() machinery, every duplicated per-constraint struct and builder function, the old LogUp constraint objects — plus two features with no remaining users (approved): periodic columns and the periodic-exemptions zerofier apparatus (end_exemptions stays; re-adding any of it later costs no constraint rewrites — zerofier metadata is orthogonal to the bodies).

Shape of the result

Tables: XxxConstraints: ConstraintSet in each table file — meta() (index-ordered {idx, kind, degree, end_exemptions}) + one eval<B: ConstraintBuilder> body. See prover/src/tables/eq.rs (small), dvrm.rs/ecdas.rs (large), crypto/stark/src/examples/simple_fibonacci.rs (minimal).
LogUp: LogUpLayout (data) + emit_logup_constraints/logup_meta (framework) replace the constraint objects; same split_interactions, same degrees, same indices.
AirWithBuses<F, E, B, PI, CS> holds the constraint set + layout + meta; AIR::constraint_program() is lazy and OnceLock-cached — the verify path never captures.
Zerofiers: 1/(xᴺ−1) with optional end-exemption roots, driven by meta through free functions.
Prover frames: the evaluator borrows each LDE row in place (RowFrame — two slice pairs straight into the row-major storage) instead of gather-copying ~150–200 elements per row per table into an owned frame.

Verification (beyond cross-version)

Every constraint body was migrated with a 1000-random-row three-way differential against its old implementation (prover folder / verifier folder / capture→interpret) plus meta parity (count, num_base, degrees, zerofier params) — the old code stayed in-tree as the oracle until all gates passed, then was deleted in one commit.
Permanent tests retained: folder-vs-captured-IR consistency per table, LogUp layout coverage (1-/2-absorbed, absorbed-only, all 10 packings), emission-completeness and meta-invariant assertions.
Interpreter path: every production AIR's captured program (via air.constraint_program()) is interpreted and compared bit-for-bit against the compiled folders on random frames — all 26 AIRs, prover and verifier sides (constraint_program_tests.rs). Exact-once emission is asserted release-safely in every differential harness (the debug-only tracker compiles out of CI's --release test build; a skipped constraint would otherwise become silently unenforced).
Suites: cargo test -p stark 169/169; prover release suite green modulo 5 pre-existing failures (missing test_commit_split.elf fixture, fails on main too). fmt/clippy clean; zero unsafe in the new modules.

Design record

thoughts/gpu-constraint-eval/impl-plan-single-source-constraints.md (in-tree) is the implementation plan with the full decision log; survey-constraint-frontends.md documents how Plonky3/OpenVM/SP1/risc0/zisk/airbender solve the same problem and why this design follows the production-validated pattern (compiled folders for CPU/guest, captured IR for GPU).

Performance

Measured with 32-pair interleaved ABBA runs on the bench server (drift-free paired comparison, ethrex-20). The migration initially cost +4.0% prove time; profiling-by-elimination traced all of it to memory behavior, none to field arithmetic:

per-row Vec allocations in the LogUp fingerprint fold — fixed (fold-style accumulator);
per-row Vec allocations in AddOperand construction (CPU ×4/row, KECCAK ~150/row) — fixed (inline AddTerms storage);
the per-row frame gather-copy — fixed (RowFrame in-place borrows, the one change with a measurable win, ~0.8%).

Arithmetic micro-optimizations (subexpression hoists, negation elimination, β⁰ skip) were each tried, measured flat, and reverted — the constraint bodies stay in their plain declarative form, with comments marking the known-and-measured redundancies. Final verdict: −0.70% vs main (CI [−1.15, −0.25]) — accepted; the GPU phase replaces this CPU path, and the interpreter-as-prover alternative measured −9% on the same bench.

Pre-review sweep

Four independent review passes (correctness edge cases, test-coverage gaps, stale docs, general design) ran over the full diff before human review:

Correctness: no live defects. The one latent finding — exact-once emission had no release-safe gate — is fixed here (see Verification).
Coverage: the interpreter path (the GPU contract) was only tested on synthetic shapes; the all-26-AIR combined-program differential closes that.
Docs: ~60 stale comments from the migration (dangling references to deleted structs, wrong counts) rewritten to describe the current code only.
Design: dead surface removed (packing_shifts context field, the always-true ConstraintProgram::complete flag).

New module crypto/stark/src/constraint_ir/ providing a flat, topologically ordered IR for transition constraints, generic over a field tower <F: IsSubFieldOf<E>, E> (defaulting to Goldilocks and its degree-3 extension): - ir.rs: Op/Dim/ConstraintProgram<F, E>. Constants live in base_consts/ ext_consts side tables referenced by index (Op::ConstBase(u32)/ Op::ConstExt(u32)), keeping Op a plain Copy+Eq+Hash payload with zero bounds on the fields. - builder.rs: IrBuilder<F, E> with (Op, Dim) hash-consing, by-value constant dedup via linear scan (FieldElement's canonicalizing PartialEq), and the id-0 = base-zero convention. - interp.rs: forward-pass interpreter; eval_program / eval_program_verifier match the AIR compute_transition_prover / compute_transition contracts, eval_program_base is the minimal single-root entry point. Const ops read the side tables directly. Not wired into the prover or verifier; no behavior change.

Hand-built IrBuilder programs checked against direct FieldElement arithmetic: every Op variant and leaf kind, mixed base/ext arithmetic with auto-embed, constant dedup (base, signed, ext), the id-0 zero convention, CSE sharing, out-of-order emit() root indexing, the complete-flag plumbing, and 1000-row randomized differential checks. The prover and verifier entry points are exercised against hand-constructed TransitionEvaluationContext values (both variants), including the base-root promotion on the verifier side and next-row (offset 1) frame reads. A reflexive-tower test (E = F over the u32 test field, which has a different modulus and BaseType than Goldilocks) proves the module is genuinely field-generic; math's test-utils feature is enabled as a dev-dependency for it.

New module crypto/stark/src/constraints/builder.rs: one constraint body, written once against the ConstraintBuilder trait, is interpreted three ways depending on the implementation it runs over: - ProverEvalFolder<F, E> (Expr = FieldElement<F>): compiled per-row prover evaluation, constructed from the Prover TransitionEvaluationContext variant plus output slices; emit_ext writes ext_evals at the absolute constraint index. - VerifierEvalFolder<F, E> (Expr = FieldElement<E>): the same body at the OOD point over the all-extension frame; const_base embeds via FieldElement::<F>::from(v).to_extension(). Monomorphized into the guest binary this is the recursion path — no capture, no hashing, no interpretation. - CaptureBuilder<F, E> (Expr = owned Rc expression tree with eager dim + degree): one setup-time run that flattens into the constraint IR's IrBuilder (hash-consing there = structural CSE) and returns the program plus per-root tree-measured degrees. Also: ExprOps/ExtExprOps operator-bound aliases (mixed base/ext ops keep the base operand on the left, matching the field tower), ConstraintMeta + RootKind plain-data constraint metadata with the dense/idx-ordered/Base-prefix invariants (num_base_from_meta), the ConstraintSet per-table trait, and debug-build emit tracking asserting every constraint index is emitted exactly once. Tests: a sample ConstraintSet (EqXor-, IsBit- and Add-carry-pair-shaped bodies plus a LogUp-shaped extension constraint) checked on 1000 random rows three ways — prover folder vs direct arithmetic, prover folder vs interpreted capture, verifier folder vs interpreted capture — plus measured-vs-declared degrees, meta invariants, and the completeness asserts. Not wired into any production path; no behavior change.

New module crypto/stark/src/constraints/zerofier.rs: the bodies of the TransitionConstraintEvaluator default methods (end_exemptions_roots, end_exemptions_lde_evaluations, zerofier_evaluations_on_extended_domain, evaluate_zerofier) relocated verbatim to free functions consuming plain ConstraintMeta — they only ever read the metadata getters. The trait defaults remain untouched and stay the production path until tables convert to ConstraintSet. Tests assert the free functions match the trait defaults bit-for-bit on a configurable boxed constraint across every branch: the default shape, end exemptions, period/offset combinations, and periodic exemptions with and without end exemptions, over both the LDE-domain and OOD entry points.

…t front-end survey

Two additions to the builder framework tests, per the review pre-flight: - num_base alignment guard (release-checked): a counting capture wrapper records which emit_* sink the sample body calls per index and asserts the base-emit set is exactly the num_base_from_meta prefix, and that every captured root's dim matches the interpreter's c < num_base routing (the .as_base() panic condition). num_base has independent sources of truth in meta, the folders and finish(num_base) — this pins them together. - next-row + multi-alpha differential: a LogUp-accumulator-shaped sample set reading aux(1, col) (next-row accumulator) and two distinct alpha powers — both used by the real 1-/2-absorbed LogUp bodies and covered by neither existing sample — checked three ways on 1000 random two-step frames (prover folder vs direct arithmetic vs interpreted capture; verifier folder vs interpreted capture).

Non-destructive emit_*/*_meta function pairs in constraints/{templates, cpu}.rs, written once against the generic ConstraintBuilder so one body serves the compiled prover folder, the verifier folder and IR capture: - templates: emit_is_bit (conditional + unconditional), emit_add_pair (the carry pair from ONE body, covering every AddOperand variant via shared operand/term helpers) - cpu: emit_product_zero, emit_arg2_exclusive, emit_mem_flags_bit, emit_reg_not_read_is_zero, emit_arg2, emit_rvd_eq_res, emit_branch_rvd_pair, emit_branch_cond, emit_next_pc_add_pair (the two pc + instruction_length carry pairs share one gated body) Each *_meta returns the idx-ordered ConstraintMeta (declared degree, default zerofier shape — none of these constraints override period/ offset/exemptions, matching the structs). The old boxed constraint structs stay untouched: they are the differential oracle for the transcription until the table conversion deletes them.

The transcription gate for the constraints switch: every emit_* body is checked against the old boxed struct's evaluate on 1000 random rows (off-trace points, where a weakened or slipped transcription diverges with overwhelming probability), three ways per constraint: - ProverEvalFolder output vs old evaluate::<Gl, Gl3> - VerifierEvalFolder output vs old evaluate::<Gl3, Gl3> - CaptureBuilder -> flatten -> interpret vs old evaluate::<Gl, Gl3> plus tree-measured degree == declared meta degree == old degree(), and *_meta zerofier parameters == the old struct's period/offset/ exemptions_period/periodic_exemptions_offset/end_exemptions. Covers all eleven kinds; the ADD pair runs three configurations (conditional dword, unconditional DWordHL + negative-coefficient linear, multi-column condition with Word/Constant/DWordBL operands) so every AddOperand variant and both const_signed signs are exercised.

… by design); adopt cross-version verification + random-row differentials

Add EqConstraints / StoreConstraints implementing ConstraintSet, mirroring the old boxed builders index-for-index. New differential test module constraint_set_tests_b compares old evaluate_prover/evaluate_verifier and capture->interpret against the new single body on 1000 random rows, plus meta parity and measured-vs-declared degree.

…test

…ConstraintSet Add MemwConstraints (15), MemwAlignedConstraints (8), MemwRegisterConstraints (3) mirroring the old boxed builders index-for-index; extend constraint_set_tests_b with 1000-row differential + meta-parity checks.

…intSet Add BranchConstraints (5, carry-bit next-pc pairs + JALR is-bit) and CommitConstraints (8, is-bit + first/end=>mu + two ADD carry pairs) mirroring the old boxed builders index-for-index; extend constraint_set_tests_b with 1000-row differential + meta-parity checks.

…straintSet Add KeccakConstraints (51: 25 mu-gated ADD carry pairs for state_ptr lanes + top-lane no-overflow) and KeccakRndConstraints (20 mu-gated is-bit on Cxz_right), mirroring the old boxed builders index-for-index; extend constraint_set_tests_b with 1000-row differential + meta-parity checks.

Add Cpu32Constraints (32: is-bit flags, ADD/SUB carry pairs, register-zero limbs, sign-extension arithmetic, sign-zero/arg2-exclusive/flag=>mu), unrolling the old multi-kind cpu32_constraints assembly into straight-line emit calls; extend constraint_set_tests_b with 1000-row differential + meta-parity checks.

…r all 8 tables

…ppy)

…-switch # Conflicts: # prover/src/tests/mod.rs

These two examples were the sole users of sub-row/virtual-column reads and periodic columns, both abandoned features. The framework machinery (Op::Periodic, periodic zerofier paths) stays; it is removed together with the engine machinery in a later phase.

A cargo example target (requires the test-utils feature) that proves and verifies each example AIR with bincode-serialized proofs, mirroring bin/cli's VM-proof format. Trace sizes and public inputs mirror the existing stark tests, so a proof produced by one version of the constraint system can be checked by another — this binary is the old-verifier side of scripts/cross_verify_examples.sh. Public-input structs gain serde derives (with an explicit FieldElement bound) so the proofs, which embed them, can round-trip.

Shared plumbing for routing an AIR's compute_transition_prover / compute_transition through a ConstraintSet body via the folders. The verifier-flavored helper also accepts a Prover context (debug trace validation calls compute_transition with a prover frame) by running the prover folder and promoting the Base prefix, mirroring the old boxed path's evaluate_verifier promotion. The engine switch reuses these.

…y_air to ConstraintSet Each example gains a single-body ConstraintSet transcribing its old evaluate_verifier text, and routes compute_transition_prover / compute_transition through the folders via the run_transition_* helpers; num_base_transition_constraints comes from num_base_from_meta. Old TransitionConstraintEvaluator impls stay (zerofier machinery + deletion in a later phase). Meta preserves indexing, degrees, and end exemptions (simple_fibonacci 2, quadratic 1, dummy fib 2 / bit 0, simple_addition 0).

…o ConstraintSet Same recipe: single-body ConstraintSet transcribed from the old evaluate_verifier text, compute_transition_prover / compute_transition routed through the folders, num_base from num_base_from_meta. All three read the next row (fibonacci_multi_column reads two next rows); end exemptions copied (1, 1, and 2 per column respectively). The multi-column set is parameterized by num_columns, one constraint per column with idx == column, exactly as the old per-column structs.

The RAP/LogUp trio: continuity and single-value stay Base constraints; the permutation/LogUp constraints read the auxiliary column and the interaction challenges, so they are Ext constraints after the Base prefix (num_base = 1 resp. 2 via num_base_from_meta, overriding the old all-ext default — value-neutral: the engine's F-vs-E accumulation split changes, the composition polynomial does not). Degrees and end exemptions copied from the old structs, including fibonacci_rap's steps=16-hard-coded fib end exemptions and the degree-3 LogUp term. multi_table_lookup has no example-level constraints (all LogUp, framework-generated); noted in the file — it converts with AirWithBuses in the engine-switch phase.

…fier fns The two period=1 cases now exercise zerofier::end_exemptions_roots directly with a ConstraintMeta. The nonzero-offset case exists purely to exercise the period != 1 zerofier shape, which no production constraint uses; it stays on the old trait path untouched and dies with that machinery in the final deletion phase (noted inline).

…evidence scripts/cross_verify_examples.sh builds examples_cli at two refs (bench_abba-style isolated worktree) and, per example AIR, checks prove NEW -> verify OLD and prove OLD -> verify NEW. The verifier recomputes OOD constraint evaluations from its own definitions, so any constraint (re)ordering, num_base drift, indexing or semantic change in the migration fails loudly — no proof determinism needed. cross_verify_examples.log: run with OLD=88adbfa6 (pre-migration examples-cli) and NEW=f2d34efd (all examples on ConstraintSet): all 11 examples pass in both directions.

…traints + logup_meta) Generate the LogUp transition constraints from the interaction config through the generic ConstraintBuilder, so one body serves the compiled prover folder, the verifier folder and IR capture. LogUpLayout captures what AirWithBuses::new computes (committed pairs, absorbed interactions, term/acc column indices); emit_logup_constraints emits the batched-term and accumulated constraints (1- and 2-absorbed branches, aux(1,.) next-row reads); logup_meta reproduces the boxed structs' degree/zerofier answers (all RootKind::Ext, default shape). The fingerprint/multiplicity/packing capture helpers are ported from the spike branch's IrBuilder-shaped helpers to the generic B: ConstraintBuilder API (operator style, base operand LEFT for mixed base x ext ops). Differential test (logup_single_source_tests) compares the OLD boxed LogUp structs vs the new emit fns via ProverEvalFolder, VerifierEvalFolder and capture->interpret on 1000 random two-step frames, for 1-absorbed, 2-absorbed, absorbed-only, and every Packing variant.

The CPU table's transition constraints are assembled by create_all_cpu_constraints in prover/src/constraints/cpu.rs (never converted — P1 covered only prover/src/tables/*.rs). Add CpuConstraints: ConstraintSet built from the existing emit_*/*_meta fns in that file, in the same order as the old assembly (39 constraints, all base-field, idx 0..38). Differential test (constraint_set_tests_b::cpu) compares it against the old boxed create_all_cpu_constraints assembly: count / num_base / per-idx degree / zerofier params, plus the 1000-row three-way folder-vs-interpreter differential.

All value-identical; each item is independently revertable if the bench says it isn't worth its cleverness: - IsSubFieldOf gains sub_from (ext - base) with a neg(sub) default; Goldilocks ext2/ext3 specialize it to touch only component 0. sub_subfield routes through it (was 1 sub + 5 negs, now 1 sub). - ConstraintBuilder gains ext_sub_base with the -(v - e) default (capture IR unchanged); the eval folders override via sub_subfield. Used for the fingerprint seed z - bus_id and the 1-absorbed accumulated root. - LogUp sender/receiver signs resolve as sub-vs-add instead of negating the receiver term (x - (-t) = x + t): no ext negation per term. - Evaluator uniform fold seeds with constraint 0's promoted evaluation (transition_coefficients[0] is beta^0 = 1 by construction) - skips a multiply-by-one per row without branching on the value. - Shared-subexpression hoists: LOAD sign-extension product (was built 7x per row), MUL sign-fills (4x), LT carry_0/carry_1 (3x/2x), BRANCH next-pc repack + per-path carry_0 (4x/2x).

MauroToscano · 2026-07-02T19:25:30Z

/bench-abba 32

github-actions · 2026-07-02T19:26:46Z

⏳ ABBA tiebreaker started on the bench server (~30–40 min). The bench server is occupied until it finishes.

github-actions · 2026-07-02T19:46:50Z

ABBA tiebreaker — `73d33d4ea5` vs `main` (32 pairs)

=== ABBA paired result  (improvement: + = PR faster) ===
  pairs: 32   mean A (PR): 17.015s   mean B (base): 16.793s

  [parametric] paired-t   mean -1.33%   sd 1.52%   se 0.27%
               95% CI: [-1.88%, -0.78%]   (t df=31 = 2.042)
  [robust]     median -1.34%   Wilcoxon W+=53 W-=475  p(exact)=2.0e-05  (z=-3.94)

  --- server stability (this run; compare across servers) ---
  run-to-run jitter:    A CV 1.00%   B CV 0.85%        (lower = steadier)
  within-session drift: -0.43% over the run, 1st->2nd half -0.33%
    (jitter -> Tier-1 cached gate floor; drift -> whether the cached baseline can be trusted)

  VERDICT: REAL REGRESSION - PR slower by ~1.33% (t-CI and Wilcoxon agree)

  raw pairs: /tmp/abba_run/pairs.csv

_{Drift-free interleaved A/B/B/A measurement. + = PR faster. Trust the verdict when paired-t and Wilcoxon agree.}

The LOAD/MUL/LT/BRANCH hoists from the micro-op bundle measured flat on ABBA (-1.33% vs the pre-bundle -1.5%, within noise), so the bodies go back to their declarative per-emit form -- the constraint bodies double as the spec, and cleverness there has to earn its keep. Each site keeps a comment recording that the redundancy is known and was measured to not matter, so it doesn't get re-optimized. The engine-side pieces of the bundle (ext_sub_base, sender-sign sub-vs-add, beta^0 seed) are body-invisible and stay.

Drops ext_sub_base (builder primitive + IsSubFieldOf::sub_from + the Goldilocks specializations), the sender-sign sub-vs-add rewrite, and the beta^0-seeded evaluator fold. All measured flat on ABBA like the body hoists, and each adds trait surface or non-local invariants the straightforward form doesn't need. The branch's constraint plumbing is back to the af8a23a state; the only bundle survivors are the known-redundancy comments in the table bodies.

The LDE buffers have been row-major since the row-major LDE rework, but the evaluator still gather-copied every main and aux column of every transition offset into an owned Frame on each LDE point (~150-200 element clones per row per table) before the constraint body ran. Replace the Prover context's frame with RowFrame: one borrowed (main, aux) row-slice pair per transition offset, taken straight from the row-major storage with the same cyclic row arithmetic. The folder and the IR interpreter read rows[offset][col] directly; the per-thread preallocated Frame and fill_from_lde are deleted (single-row steps only - the sole shape since virtual columns were removed; asserted). Frame stays for the verifier/OOD path and debug validation, which bridges via Frame::as_row_frame. Reads the same values from the same memory - proofs are unchanged.

MauroToscano · 2026-07-02T20:07:43Z

/bench-abba 32

github-actions · 2026-07-02T20:07:52Z

⏳ ABBA tiebreaker started on the bench server (~30–40 min). The bench server is occupied until it finishes.

Found by clippy::redundant_clone sweeping for pointless clones; the only hit in production code (setup-time, cosmetic). The remaining test/example hits are left as-is.

github-actions · 2026-07-02T20:27:51Z

ABBA tiebreaker — `3638d7a1c7` vs `main` (32 pairs)

=== ABBA paired result  (improvement: + = PR faster) ===
  pairs: 32   mean A (PR): 16.903s   mean B (base): 16.787s

  [parametric] paired-t   mean -0.70%   sd 1.25%   se 0.22%
               95% CI: [-1.15%, -0.25%]   (t df=31 = 2.042)
  [robust]     median -0.84%   Wilcoxon W+=118 W-=410  p(exact)=0.0054  (z=-2.72)

  --- server stability (this run; compare across servers) ---
  run-to-run jitter:    A CV 0.68%   B CV 0.98%        (lower = steadier)
  within-session drift: -0.23% over the run, 1st->2nd half -0.21%
    (jitter -> Tier-1 cached gate floor; drift -> whether the cached baseline can be trusted)

  VERDICT: REAL REGRESSION - PR slower by ~0.70% (t-CI and Wilcoxon agree)

  raw pairs: /tmp/abba_run/pairs.csv

_{Drift-free interleaved A/B/B/A measurement. + = PR faster. Trust the verdict when paired-t and Wilcoxon agree.}

Four-agent review sweep found ~60 stale comments left behind by the multi-phase migration: doc blocks claiming the deleted per-constraint structs still exist ('the old structs stay for now'), ~31 dangling rustdoc links to deleted symbols (Twin of X, matches X::eval), count headers that disagreed with the code (MEMW says 11 constraints, has 15; SHIFT lists 26 columns, has 29), and references to removed concepts (virtual columns, periodic tests, empty_constraints()). All comments now describe the current code in plain present tense -- constraint-index maps verified against meta()/eval(), bus names and counts recounted from bus_interactions(). No historical framing: migration provenance lives in git, not doc comments.

Both are write-only vestiges the design review flagged: - TransitionEvaluationContext::packing_shifts was constructed and passed at five sites and read at none -- the single-source bodies lower the packing shift constants through const_base, so the folders never touch it. The field, both constructor params, and three per-prove PackingShifts::new() constructions go away. The PackingShifts type stays: the aux-trace build path genuinely uses it. - ConstraintProgram::complete / IrBuilder::mark_unsupported encoded a fall-back-to-boxed-path protocol for partially captured AIRs; the boxed path no longer exists and every AIR captures fully, so the flag was always true and read only by tests asserting it's true. Also adds fail-loud asserts on the capture-time u8/u16 narrowing (offsets, columns, challenge/alpha indices) in IrBuilder and CaptureBuilder: capture runs once at setup, and a table wider than the IR encoding must panic rather than silently truncate into the GPU program.

…nce checks The test-gap review found the capture->IR->interpret pipeline (the future GPU path; no production caller today) was verified only on shapes production never uses: no real table's combined base+LogUp program was ever interpreted, multi-committed-pair layouts and five of the seven Multiplicity variants never flowed through the interpreter. - constraint_program_tests: every production AIR's captured program (via the production constraint_program() entry point) interpreted and compared bit-for-bit against the compiled folders on random two-step frames, prover and verifier sides -- all 26 AIRs. - logup_two_committed_pairs: >= 2-pair layout fixture, exercising the batched-term loop and the accumulated term-column sum past their first iteration. - RowFrame::from_lde unit tests: per-offset row borrows, the cyclic wrap at the domain end, the offsets cap, and as_row_frame equivalence. The correctness review found the exact-once-emission invariant had no release-safe gate: EmitTracker is debug-only and CI runs tests --release, so a double-emit/skip-swap typo would ship a silently unenforced (always-zero) constraint. Every differential harness now asserts the emitted index set is exactly 0..n in any build profile, and the per-table test rejects roots left at the id-0 sentinel.

- ConstraintSet doc: meta() and eval are parallel index walks that must agree entry for entry; say so where implementers read it. - IrBuilder::const_ext/embed: note both are unreachable from the single-body capture path and kept for IR completeness / GPU lowering. - emit_busvalue_fingerprint: state why only the Linear arm routes through the zero-skip hook. - zerofier: the end-exemption walk computed the same pow twice.

The raw cross-version verification logs (examples 22/22, VM 6/6, at the pre-deletion / post-deletion / final-tip checkpoints) are runtime artifacts, not source. They stay available on the frozen branch reference/sscs-cross-verify-evidence; this branch keeps only the scripts that regenerate them (scripts/cross_verify_*.sh).

The RowDomain refactor turned emit_base/emit_ext into provided defaults that forward to emit_base_rows/emit_ext_rows, adding a call hop on the per-constraint-per-row prover path (vs #764's direct folder emit). A paired ABBA vs #764 (12 pairs) showed a real ~1.1% regression: paired-t 95% CI [-1.99%, -0.27%], Wilcoxon W=10 (significant), 10/12 pairs slower. Marking the forwarding defaults and ProverEvalFolder's emit_*_rows #[inline] collapses the hop back to a direct slice write. Pure codegen hint — no value or wire change, so cross-verification stays green.

…#772) * refactor(stark): derive constraint metadata from the single eval body Metadata (kind, degree, end-exemptions) is now DERIVED from each table's `eval` body instead of hand-maintained in a parallel `meta()` method, so the two can no longer drift out of sync. - `ConstraintBuilder::emit_base/emit_ext` now take the constraint's `degree`; new `emit_*_exempt` variants also take `end_exemptions`. The prover and verifier folders ignore both (dead args -> no hot-path cost); only metadata derivation reads them. - `ConstraintSet::meta()` becomes a provided default that runs the body through a new no-op `MetaBuilder` (records {idx, kind, degree, end_exemptions}). Kind is implied by which sink is called; degree stays hand-declared, now at the emit site next to the expression it describes. - Every per-table `meta()` override, every `*_meta` helper twin, and `logup_meta` are deleted; LogUp metadata is derived from `emit_logup_constraints` the same way. Net -217 LoC across all tables + LogUp + the crypto/stark examples. Verified: workspace + all test targets compile; 169 stark tests and 96 prover constraint tests pass (including constraint_program_tests:: all_table_programs_match_folders and the per-table constraint counts); the capture->interpreter differential and declared-vs-measured-degree gates hold; `make lint` clean on all feature sets. * refactor(stark): split constraint degree (per-table) from row-domain Removes per-constraint `degree` and the positional-int emit args, replacing them with two orthogonal, self-describing concepts: - Degree: only the per-table MAX is consumed (by composition_poly_degree_bound), so it is declared once via `ConstraintSet::max_degree()` (default 2, overridden to 3 on the degree-3 tables) instead of on every emit call. `ConstraintMeta` drops its `degree` field; the bound now reads `max_degree().max(logup_max_degree(layout))`. - Row-domain: `emit_base_exempt(idx, degree, n, e)` becomes `emit_base_rows(idx, RowDomain::except_last(n), e)` — a named type, so the rare end-exemption reads in plain language and is no longer welded onto degree (three unlabeled ints -> one named argument). Only the crypto/stark example AIRs use it; every production table's emit is now `emit_base(idx, expr)`. The composition-poly bound stays byte-identical: each degree-3 table declares `max_degree()` == its former per-constraint max, and the framework folds in the LogUp max via `logup_max_degree`. The capture path asserts each constraint's measured degree is `<= max_degree()` — which caught keccak_rnd's mu-gated (degree-3) IS_BIT during migration. Verified: workspace + all test targets compile; 169 stark tests and 96 prover constraint tests pass (including constraint_program_tests:: all_table_programs_match_folders and the per-table measured<=max_degree gates); `make lint` clean on all feature sets. * perf(stark): inline the emit forwarding onto the per-row hot path The RowDomain refactor turned emit_base/emit_ext into provided defaults that forward to emit_base_rows/emit_ext_rows, adding a call hop on the per-constraint-per-row prover path (vs #764's direct folder emit). A paired ABBA vs #764 (12 pairs) showed a real ~1.1% regression: paired-t 95% CI [-1.99%, -0.27%], Wilcoxon W=10 (significant), 10/12 pairs slower. Marking the forwarding defaults and ProverEvalFolder's emit_*_rows #[inline] collapses the hop back to a direct slice write. Pure codegen hint — no value or wire change, so cross-verification stays green.

MauroToscano added 30 commits July 1, 2026 21:56

docs(gpu-constraint-eval): single-source constraints plan + constrain…

e72e1b3

…t front-end survey

docs(gpu-constraint-eval): retire golden-proof gate (nondeterministic…

dece3d9

… by design); adopt cross-version verification + random-row differentials

sscs: convert LT table to single-source ConstraintSet + differential …

4547890

…test

sscs: convert DVRM + SHIFT tables to single-source ConstraintSet

51a07b9

sscs: convert MUL + LOAD tables to single-source ConstraintSet

48fab6a

sscs: convert ECSM + ECDAS + EC_SCALAR tables + differential tests fo…

5e247ef

…r all 8 tables

spike(stark): use iterator comparisons in constraint_set_tests_b (cli…

1bb9ff3

…ppy)

Merge branch 'feat/sscs-tables-b' into feat/single-source-constraints…

e6cd301

…-switch # Conflicts: # prover/src/tests/mod.rs

MauroToscano mentioned this pull request Jul 2, 2026

perf(stark): borrow trace rows in place for prover transition eval #767

Closed

MauroToscano added 3 commits July 2, 2026 16:54

prover: drop a redundant Multiplicity clone in shift bus setup

30b6148

Found by clippy::redundant_clone sweeping for pointless clones; the only hit in production code (setup-time, cosmetic). The remaining test/example hits are left as-is.

MauroToscano marked this pull request as ready for review July 2, 2026 20:21

ColoCarletti approved these changes Jul 2, 2026

View reviewed changes

MauroToscano added 5 commits July 2, 2026 18:31

MauroToscano mentioned this pull request Jul 3, 2026

refactor(stark): derive constraint metadata from the single eval body #772

Merged

MauroToscano added 2 commits July 3, 2026 16:57

Merge branch 'main' into feat/single-source-constraints-switch

4fa8c0a

MauroToscano enabled auto-merge July 3, 2026 20:05

MauroToscano disabled auto-merge July 3, 2026 20:05

MauroToscano mentioned this pull request Jul 3, 2026

feat(stark): field-generic constraint IR + single-body ConstraintBuilder framework #763

Closed

MauroToscano added this pull request to the merge queue Jul 3, 2026

Merged via the queue into main with commit 0131c03 Jul 3, 2026
13 checks passed

MauroToscano deleted the feat/single-source-constraints-switch branch July 3, 2026 20:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(stark): single-source constraints — one definition per constraint, verified cross-version [stacked on #763]#764

feat(stark): single-source constraints — one definition per constraint, verified cross-version [stacked on #763]#764
MauroToscano merged 55 commits into
mainfrom
feat/single-source-constraints-switch

MauroToscano commented Jul 2, 2026 •

edited

Loading

Uh oh!

MauroToscano commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

MauroToscano commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

MauroToscano commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The proof, first

What this PR is

Shape of the result

Verification (beyond cross-version)

Design record

Performance

Pre-review sweep

Uh oh!

MauroToscano commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

ABBA tiebreaker — 73d33d4ea5 vs main (32 pairs)

Uh oh!

MauroToscano commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

ABBA tiebreaker — 3638d7a1c7 vs main (32 pairs)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

MauroToscano commented Jul 2, 2026 •

edited

Loading

ABBA tiebreaker — `73d33d4ea5` vs `main` (32 pairs)

ABBA tiebreaker — `3638d7a1c7` vs `main` (32 pairs)