Skip to content

M9 finish (2/2): §14.9 differential / reference-parser tester#35

Merged
guyo13 merged 1 commit into
mainfrom
claude/youthful-hamilton-tncsqv
Jul 2, 2026
Merged

M9 finish (2/2): §14.9 differential / reference-parser tester#35
guyo13 merged 1 commit into
mainfrom
claude/youthful-hamilton-tncsqv

Conversation

@guyo13

@guyo13 guyo13 commented Jul 2, 2026

Copy link
Copy Markdown
Owner

M9 finish, task 2 of 2 — the §14.9 differential tester

Builds on the Task-1 regrown corpus (#34, merged). This is the technique that catches a recovery-classifier error by construction — two independent implementations disagreeing — rather than probabilistically (a fuzzer happening to hit the trigger). It exists precisely because the issue-#26 sentinel hole was that class of bug: a classifier mis-mapping one byte pattern.

What's here

  • tests/differential.rs — an independent reference segment parser run alongside the production recover_segment classifier, with an exact-match oracle on the SegClass variant and its offset/max_lsn.
  • Independence (the hard rule): the reference calls no production parse code — not record::decode, not recover_segment, not the segment.rs read helpers. It re-derives the §5.2/§5.3 constants and re-implements the length-bound check, the all-zero-header sentinel rule (post Durability contract correctness issue - Sentinel silent truncation #26), the CRC-validation ordering, the bounded tail-vs-corruption forward scan (lsn >= expected, bound max+28), and the sealed-vs-active distinction from raw bytes. It uses only the shared crc32c primitive (a dependency, not parse logic — re-implementing CRC-32C would test the crate, not the parser).
  • Post-Durability contract correctness issue - Sentinel silent truncation #26 contract: the reference treats a rec_type == 0 record with a non-zero CRC as Invalid (→ TornMidLog interior / torn-tail at the end), never a clean sentinel — the correct current spec.

Inputs (both parsers see identical bytes)

  1. Scenario matrix — 184 cases, enumerating every classification arm: clean runs (payload sizes 0/1/7/8/max), torn tails (CRC flip / rec_type→0 / extend-len / reserved-type / pad-flip), interior corruption (incl. the issue-Durability contract correctness issue - Sentinel silent truncation #26 rec_type→0 vector), LSN gaps, mid-run sentinel, physical truncation, within/beyond-scan-bound continuations, len>max, non-1 bases — each for active and sealed segments. This is the exact-match oracle.
  2. The committed Task-1 corpora (fuzz/corpus/{recovery,structure,decode,model}) as raw segment bodies — 1666 inputs of real fuzzer-discovered byte patterns the two parsers must still agree on.

Green here: 184 + 1666 agree, ~6 s.

Falsifiability (§14.0.3)

Injecting the pre-#26 naive rec_type == 0 ⇒ sentinel rule into the reference makes the differential fire —

DIVERGENCE [torn-last-zero_rectype] … production=Truncated { offset: 128, max_lsn: 2 } reference=Clean { max_lsn: 2 }
DIVERGENCE [corpus/recovery]        … production=Truncated { offset: 64,  max_lsn: 0 } reference=Clean { max_lsn: 0 }

— i.e. the harness would have caught the sentinel bug. Reverted.

Scope note (honest, per the kickoff)

The corpus is consumed as raw segment bodies rather than by re-decoding each fuzz target's arbitrary envelope (which would duplicate those generators and be fragile). The differential property "both parsers agree on these bytes" holds regardless of how the bytes were produced; the envelope-specific deep states are covered exhaustively by the scenario matrix. No §14.3-model-oracle→differential wiring (corpus-driven is acceptable for v1 per the kickoff).

src/ change & release impact

The only src/ change is a thin #[cfg(feature="fuzzing")] recover_segment_classify accessor in lib.rs (returns a plain SegClass). No public API widening, no second production parser. The test is #[cfg(feature="fuzzing")], so default cargo build/cargo test compile it to nothing — zero release impact.

CI

Per-PR blocking differential (§14.9) job in ci.yml (cargo test --features fuzzing --test differential, ~6 s). A classification divergence is a real recovery-classifier bug and reds the PR — same posture as the fuzz smoke; never gates an H1 dispatch.

Verification

cargo fmt --check, cargo clippy --all-targets -D warnings (default and --features fuzzing), default cargo test (differential compiles to nothing), cargo test --features fuzzing --test differential, MSRV 1.85 cargo check --all-targets --locked, actionlint — all green.

Docs

§14.9 marked implemented, §14.11 per-PR row, §14.12 D4/D5/D10/D11 rows now cite the differential, §14.13 DoD bullet, CLAUDE.md.

After this

M9 finish follow-ups remain (separate PRs, per the designer's close-out): (1+2) a deterministic interior-rec_type→0 permanent regression seed + the fuzz-coverage trust-model writeup, then (3) the DoD reconciliation audit. After those, M9 is software-complete (only the passive fuzz-hours / multi-hour-soak runner gates remain).

🤖 Generated with Claude Code


Generated by Claude Code

Add tests/differential.rs: an independent, spec-derived reference segment
parser run alongside the production recover_segment classifier, with an
exact-match oracle on classification. Catches a recovery-classifier error by
construction (two implementations disagreeing) rather than probabilistically —
the class of bug the issue #26 sentinel hole was.

The reference calls no production parse code (not record::decode, not
recover_segment, not the segment.rs helpers): it re-derives the §5.2/§5.3/§8.2
constants and re-implements the length-bound check, the all-zero-header
sentinel rule (post #26), the CRC-validation ordering, the bounded
tail-vs-corruption forward scan, and the sealed-vs-active distinction from raw
bytes. It uses only the shared crc32c primitive (a dependency, not parse logic).

Inputs: a deterministic 184-case scenario matrix enumerating every arm (clean
runs, torn tails, interior corruption incl. the rec_type→0 vector, reserved
types, LSN gaps, physical truncation, within/beyond-scan-bound continuations,
len>max, non-1 bases; active + sealed) PLUS the Task-1 regrown corpora as raw
segment bodies (1666 inputs). Both parsers must return the identical SegClass
variant AND offsets/max_lsn. Green: 184 + 1666 agree, ~6 s.

Falsifiability shown: injecting the pre-#26 naive rec_type==0 ⇒ sentinel rule
into the reference makes the differential fire (production=Truncated vs
reference=Clean) on the torn-last-zero_rectype scenario and real corpus
entries — i.e. it would have caught the sentinel bug — then reverted.

The only src/ change is a thin #[cfg(feature="fuzzing")] recover_segment_classify
accessor in lib.rs (no public API widening, no second production parser). The
test is #![cfg(feature="fuzzing")], so default cargo build/test are unaffected
(zero release impact). CI: per-PR blocking `differential (§14.9)` job in ci.yml
(a divergence reds the PR, never gates H1). Docs: §14.9 implemented, §14.11
per-PR row, §14.12 D4/D5/D10/D11 rows, §14.13 DoD bullet.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rpbwt9JT56hQvVXiqTS131
@guyo13 guyo13 merged commit 1db1b93 into main Jul 2, 2026
9 checks passed
guyo13 pushed a commit that referenced this pull request Jul 2, 2026
…nly)

The closing M9 truthfulness pass. New §14.12.1 subsection reconciles the
traceability matrix against the actual tree, with every covering test
grep-verified to exist and each D1–D12 row marked passing (a currently-existing
software test) vs contingent (a named physical-layer gate).

Verified (all 13 cited symbols confirmed present):
- D5 traces to the sentinel-fix tests (rec_type_zeroed_interior_is_fatal_tornmidlog
  + tail companion) AND the §14.9 differential — not anything stale.
- D4/D10/D11 cite the §14.9 differential; those PR #35 edits landed on main and
  name real differential_* symbols.
- loom is absent from the matrix, the DoD, and the code (grep -i loom over
  src/ tests/ .github/ Cargo.toml finds only the docs "why not" note).
- §14.9 differential present and implemented.
- Each contingent gate names its unmet condition: fuzz ≥24 CPU-h/target since
  2b198e7 (dedicated runner), soak multi-hour run, H1 owner hardware, LazyFS
  FUSE runner, §14.4d documented negative result.
- Structural-vs-substantive caveat flagged: the D11 bounded-scan counter is a
  drift guard (the loop window IS scan_bound), not an input-bug finder; the
  substantive D11 proof is F1–F4's crash-free surface + the differential.

§14.13's "every row has a passing test" bullet now points at §14.12.1. Docs-only
(docs/wal_design_v6.md + CLAUDE.md); no src/, no test, no workflow.

After this, M9 is software-complete; the only remaining §14.13 gates are passive
runner time (fuzz 24-CPU-h/target, multi-hour soak), accruing off the critical
path alongside H1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rpbwt9JT56hQvVXiqTS131
guyo13 pushed a commit that referenced this pull request Jul 2, 2026
…nly)

The closing M9 truthfulness pass. New §14.12.1 subsection reconciles the
traceability matrix against the actual tree, with every covering test
grep-verified to exist and each D1–D12 row marked passing (a currently-existing
software test) vs contingent (a named physical-layer gate).

Verified (all 13 cited symbols confirmed present):
- D5 traces to the sentinel-fix tests (rec_type_zeroed_interior_is_fatal_tornmidlog
  + tail companion) AND the §14.9 differential — not anything stale.
- D4/D10/D11 cite the §14.9 differential; those PR #35 edits landed on main and
  name real differential_* symbols.
- loom is absent from the matrix, the DoD, and the code (grep -i loom over
  src/ tests/ .github/ Cargo.toml finds only the docs "why not" note).
- §14.9 differential present and implemented.
- Each contingent gate names its unmet condition: fuzz ≥24 CPU-h/target since
  2b198e7 (dedicated runner), soak multi-hour run, H1 owner hardware, LazyFS
  FUSE runner, §14.4d documented negative result.
- Structural-vs-substantive caveat flagged: the D11 bounded-scan counter is a
  drift guard (the loop window IS scan_bound), not an input-bug finder; the
  substantive D11 proof is F1–F4's crash-free surface + the differential.

§14.13's "every row has a passing test" bullet now points at §14.12.1. Docs-only
(docs/wal_design_v6.md + CLAUDE.md); no src/, no test, no workflow.

After this, M9 is software-complete; the only remaining §14.13 gates are passive
runner time (fuzz 24-CPU-h/target, multi-hour soak), accruing off the critical
path alongside H1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rpbwt9JT56hQvVXiqTS131
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants