Conversation
|
/ai-review |
Codex Code ReviewFindings
No other safety, correctness, or significant performance issues found in the PR diff. Static review only; I did not build or run tests per the review constraints. |
Review: cache verifying key + commitments (recursion opt)The mechanical refactor is clean — the 1. High — Soundness rationale is incorrect. Sourcing the preprocessed-table commitments from a prover-supplied 2. Medium — Panic on untrusted input. Note: The production |
AI ReviewPR #727 · 13 changed files Findings
Status column reflects the verdict from the verifier: deepseek-verifier (openrouter/deepseek/deepseek-v4-pro). AI-001: VmProof gains required vk_digest field — breaks old proof artifacts
Claim Adding the required Evidence VmProof struct in prover/src/lib.rs:160-183 now requires vk_digest. Minimal tests (prover/src/tests/prove_elfs_tests.rs:141) set it to Suggested fix If backward compatibility with archived proof bundles matters, mark the field with AI-002: VmVerifyingKey::from_elf_and_options register_init footgun
Claim VmVerifyingKey::from_elf_and_options takes Evidence prover/src/vkey.rs:121-149 uses Suggested fix Either drop the AI-003: compute_digest uses allocating to_allocvec per verify
Claim VmVerifyingKey::compute_digest calls Evidence prover/src/vkey.rs:154-160 uses Suggested fix If measured to matter, replace with a streaming hash ( Reviewer Lanes
Verification Lanes
Native Codex and Claude reviews run separately and post their own comments. They are not included in this structured provenance report. Raw lane outputs, candidates, final issues, and model metrics are uploaded as workflow artifacts. |
b8aa349 to
7590c91
Compare
71e9375 to
32319b1
Compare
|
/ai-review |
Codex Code ReviewFindings
No production safety/security issues found in the changed code from static review. I did not build or run tests per the review constraints. |
|
Review — vkey caching + digest binding Reviewed the full diff (the PR branch isn't checked out in this sandbox; base main lacks these files, so this is diff-only). Overall: solid. The soundness story is coherent and well-documented, and test coverage is thorough (per-table tamper rejection, short-pages, options-mismatch, wrong-version, round-trip, verify-equivalence). No safety or correctness issues found. Notes below are minor. Soundness — checks out
Minor
Nothing blocking. |
Add `VmVerifyingKey` (prover/src/vkey.rs): host-derived cache of the five preprocessed-table Merkle commitments (BITWISE, DECODE, REGISTER, KECCAK_RC, per-PAGE). `VmAirs::new_with_vkey` / `verify_with_options_with_vkey` take the cached commitments instead of recomputing them — recomputation is ~87% of verifier cycles inside the recursion guest. Soundness is preserved by Fiat-Shamir. The recursion and deserialize-only guests and the smoke test now encode the vkey into the postcard blob `(VmProof, elf, opts, vkey)`.
A prover-supplied vkey defeats the preprocessed-commitment check: Fiat-Shamir only catches post-hoc tampering, not a coordinated prover committing to a forged table with a matching vkey. Bind keccak(vkey) as vk_digest: embed ProofOptions in the vkey (query count and grinding factor pin no commitment), stamp the digest into VmProof and the statement transcript (V3), check it in verify before STARK work, and commit vk_digest || inner output from the recursion guest so the outer verifier can compare against a digest derived from the trusted inner ELF. Also validate vkey version/page-count instead of panicking on short pages, and reject on options mismatch.
The comment should still say when it needs bumping, not why it was bumped the last time.
The host roundtrip test still decoded (VmProof, elf, opts); postcard discards trailing bytes, so it silently skipped the vkey and the vkey verify path the guest actually exercises.
|
/ai-review |
Codex Code ReviewFindings
Notes I reviewed statically only, per the sandbox instructions. No build or tests were run. |
Review: vkey + commitment caching (recursion opt)I traced the full soundness path — the Soundness (verified, no issues)
Minor / optional (non-blocking)
Nice work on the test coverage — the tamper-rejection matrix (bitwise/decode/register/keccak_rc/page/zero-init/options/version/short-pages) plus the forged-program and wrong-ELF recursion tests cover the interesting cases. |
Add
VmVerifyingKey(prover/src/vkey.rs): a host-derived cache of the five preprocessed-table Merkle commitments (BITWISE, DECODE, REGISTER, KECCAK_RC, per-PAGE) plus theProofOptionsthey were derived under.VmAirs::new_with_vkey/verify_with_options_with_vkeytake the cached commitments instead of recomputing them — recomputation is ~87% of verifier cycles inside the recursion guest.Soundness
The vkey is trusted input — Fiat-Shamir only catches a vkey inconsistent with the proof (post-hoc tampering), not a coordinated prover that commits to a forged preprocessed table and supplies a matching vkey. The binding is
vk_digest = keccak(postcard(vkey)):VmProof.vk_digestand it is bound into the Fiat-Shamir statement (LAMBDAVM_STARK_STATEMENT_V3).InvalidVerifyingKeyerror before any STARK work. Malformed vkeys (wrong version, shortpagesvec, mismatched options) are rejected the same way instead of panicking.vk_digest ‖ inner public outputinstead of[1], so the outer verifier checks which vkey was used in-guest against a digest derived from the trusted inner ELF — every guest input is prover-supplied, so this outer check is what makes the recursion sound. The smoke test performs it.ProofOptionsbecause FRI query count and grinding factor affect soundness but influence no commitment; nothing else would pin them.The recursion guest and the smoke test encode the vkey into the postcard blob
(VmProof, elf, opts, vkey).Breakdown
RECURSION GUEST PROFILE (blowup=2, 1 queries)
Total cycles : 114975927
Exec time : 2.648427084s
Per-step cycle breakdown (latest-marker state machine):
bucket cycles %
0. setup (alloc init + postcard decode) 21816885 18.98%
Unique PCs : 89622
RECURSION GUEST PROFILE (blowup=8, 73 queries)
Total cycles : 2968430174
Exec time : 91.293472417s
Per-step cycle breakdown (latest-marker state machine):
bucket cycles %
0. setup (alloc init + postcard decode) 698028452 23.52%
Unique PCs : 89822
From baseline on #726:
While I left the combined step in the description, note 1. collapsed 1. and 2. in the current one and took 86.90% vs <25% and 97.93% vs <6%, and total cycles diminish by comparable amounts.
Validation
make testpasses (only the 2 CUDA tests fail — no GPU);vkey_tests(10, incl. digest/short-pages/options-mismatch rejection) andstatement_testspass.make test-profile-recursionpasses and prints the histograms: