guyo13 · guyo13 · Jul 2, 2026 · Jul 2, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -139,6 +139,25 @@ jobs:
         run: |
           cargo +nightly miri test --lib -- record:: crc:: lsn:: config::
 
+  differential:
+    # §14.9 — differential / reference-parser tester. An independent, spec-derived
+    # reference segment parser (`tests/differential.rs`) is run alongside the
+    # production `recover_segment` classifier over a deterministic scenario matrix
+    # AND the committed fuzz corpora; any divergence in classification is a real
+    # recovery-classifier bug and reds the PR (same posture as the fuzz smoke,
+    # never gates an H1 dispatch). Fast (~6 s over the minimized corpus), so it
+    # runs per-PR. Needs `--features fuzzing` for the `recover_segment_classify`
+    # accessor (test-only, zero release impact).
+    name: differential (§14.9)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install stable toolchain
+        uses: dtolnay/rust-toolchain@stable
+      - uses: Swatinem/rust-cache@v2
+      - name: differential reference-parser check
+        run: cargo test --features fuzzing --test differential
+
   dirfsync-presence:
     # M8 §14.4d Tier 1 (PRIMARY): the deterministic, FS-independent regression
     # guard for the roll-time directory fsync (§7.4 step 5). It straces the roll

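The `differential` job above depends on two independently written classifiers agreeing exactly on every input. A minimal sketch of that oracle shape follows. It uses a toy record layout (`[len][payload][sum]`) that is an assumption for illustration only and is not this project's segment format. `ToyClass`, `classify_production`, `classify_reference`, and `differential` are hypothetical names, not the real `SegClass` or `recover_segment_classify`. As in the real harness, the two parsers share only the checksum primitive.

```rust
/// Toy classification result (hypothetical; NOT the project's `SegClass`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToyClass {
    /// Every record parsed; `end` is the offset just past the last record.
    Clean { end: usize },
    /// Parsing stopped at `at`: the record there is incomplete or fails its checksum.
    Torn { at: usize },
}

/// Shared primitive: wrapping byte-sum of the payload (stands in for a real CRC).
fn checksum(payload: &[u8]) -> u8 {
    payload.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

/// "Production" classifier: index-based loop over [len: u8][payload][sum: u8] records.
pub fn classify_production(buf: &[u8]) -> ToyClass {
    let mut off = 0;
    while off < buf.len() {
        let len = buf[off] as usize;
        let end = off + 1 + len + 1;
        if end > buf.len() {
            return ToyClass::Torn { at: off };
        }
        if checksum(&buf[off + 1..off + 1 + len]) != buf[end - 1] {
            return ToyClass::Torn { at: off };
        }
        off = end;
    }
    ToyClass::Clean { end: off }
}

/// "Reference" classifier: written separately, by peeling slices instead of indexing.
pub fn classify_reference(buf: &[u8]) -> ToyClass {
    let mut rest = buf;
    loop {
        let consumed = buf.len() - rest.len();
        let Some((&len, body)) = rest.split_first() else {
            return ToyClass::Clean { end: consumed };
        };
        let len = len as usize;
        if body.len() < len + 1 {
            return ToyClass::Torn { at: consumed };
        }
        let (payload, tail) = body.split_at(len);
        if checksum(payload) != tail[0] {
            return ToyClass::Torn { at: consumed };
        }
        rest = &tail[1..];
    }
}

/// One disagreement between the two classifiers.
#[derive(Debug, PartialEq, Eq)]
pub struct Divergence {
    pub input: usize,
    pub production: ToyClass,
    pub reference: ToyClass,
}

/// Exact-match oracle: any difference in variant or offset is reported.
pub fn differential<F>(inputs: &[Vec<u8>], reference: F) -> Vec<Divergence>
where
    F: Fn(&[u8]) -> ToyClass,
{
    inputs
        .iter()
        .enumerate()
        .filter_map(|(i, bytes)| {
            let prod = classify_production(bytes.as_slice());
            let refc = reference(bytes.as_slice());
            (prod != refc).then(|| Divergence { input: i, production: prod, reference: refc })
        })
        .collect()
}
```

A test would call `differential(&inputs, classify_reference)` and fail on any non-empty result. Passing a deliberately naive reference (for example, one that always reports `Clean`) should produce divergences on torn inputs, which is the same kind of falsifiability check the status entry describes.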
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -109,6 +109,7 @@ The entire value of this component is **correct behavior under crashes and fault
 
 ## Project status (keep this updated)
 
+- **LATEST (2026-07-02): M9 finish task 2 — §14.9 differential / reference-parser tester LANDED, green here.** New `tests/differential.rs` (`#![cfg(feature="fuzzing")]`): an **independent** reference segment parser — re-deriving the §5.2/§5.3/§8.2 constants and re-implementing the length-bound check, the **all-zero-header sentinel** rule (post issue #26), the CRC-validation ordering, the bounded tail-vs-corruption forward scan, and the sealed-vs-active distinction from raw bytes, calling **no** production parse code (only the shared `crc32c` primitive) — run against production via a new `#[cfg(feature="fuzzing")]` `recover_segment_classify` accessor with an **exact-match** oracle on the `SegClass` variant *and* offsets/`max_lsn`. Inputs: a deterministic **184-case scenario matrix** (clean runs, torn tails, interior corruption incl. the `rec_type→0` vector, reserved types, LSN gaps, physical truncation, within/beyond-scan-bound continuations, `len>max`, non-1 bases; active + sealed) **plus** the Task-1 regrown corpora as raw segment bodies (**1666 inputs**) — catches a classifier error **by construction** (two implementations disagreeing), the class of bug the #26 sentinel hole was. **Green: 184 + 1666 agree, ~6 s.** **Falsifiability shown:** injecting the pre-#26 naive `rec_type==0 ⇒ sentinel` rule into the reference makes the differential fire (`production=Truncated … reference=Clean`) on both the `torn-last-zero_rectype` scenario and real corpus entries, then reverted. **CI:** per-PR blocking `differential (§14.9)` job in `ci.yml` (`cargo test --features fuzzing --test differential`; a divergence reds the PR, never gates H1). **`src/` change is only the feature-gated accessor** (no public API widening, no second production parser; default `cargo build`/`cargo test` unaffected — the test compiles to nothing without `fuzzing`). Docs: §14.9 implemented, §14.11 per-PR row, §14.12 D4/D5/D10/D11 rows, §14.13 DoD bullet. **After this: M9 finish follow-ups 1+2 (interior rec_type→0 permanent seed + coverage trust-model) then 3 (DoD audit); then M9 is software-complete.**
 - **LATEST (2026-07-01): M9 finish task 1 — F1–F4 corpus regrow + `cmin` on the post-sentinel-fix format, and the §14.13 fuzz gate's N pinned at 24 CPU-hours/target.** The sentinel fix (`2b198e7`, all-zero-header) changed how `rec_type==0, crc≠0` is classified (sentinel → `Invalid` → `TornMidLog`/torn-tail — a path that did not exist pre-fix), so the pre-fix coverage maps were stale and any §14.13 CPU-hours must be counted on the **current** format. Regrew each target (`cargo +nightly fuzz run <t> -max_total_time≈300` on top of the existing corpus) then `cargo fuzz cmin`'d to the coverage-preserving set: **recovery 174→316, structure 130→129, decode 17→40, model 321→348** minimized entries; per-target coverage rose (**recovery 780→892, structure 561→592, model 798→839**); **zero crashes** during the regrow (artifacts dirs empty). **Pinned N** — the previously vacuous "N CPU-hours" is now **≥ 24 CPU-hours per target (96 total) accumulated since `2b198e7`**, with the "since the last format change" clock made explicit (a format change resets it + mandates this regrow) — written into the §14.13 fuzz row, the `fuzz.yml` header + in-run `::warning::` banner, and a new **regrow-log** section in `fuzz/README.md` recording the `2b198e7` regrow as precedent. **Does NOT discharge the gate** — it stays CONTINGENT/OPEN pending the 24 h/target run on a dedicated runner; this only makes the contingent statement non-vacuous and honestly-anchored. No `src/` change (corpus + docs + one workflow comment/banner only). Lands as PR #1 of the two M9-finish PRs (task 2 = §14.9 differential harness reuses this regrown corpus).
 - **LATEST (2026-07-01): M9 — CI-matrix tidy-up LANDED (§14.11); this is the LAST M9 build slice. M9 is now feature-complete, with only the standing owner/dedicated-runner *observations* open.** **Docs-only reconciliation** (no `src/`, no workflow behavior change — the 8 workflows already implemented the intended matrix; the spec had simply not caught up). Rewrote §14.11 into a **faithful index** of what actually runs: the **Per-PR (blocking)** row now enumerates the real `ci.yml` jobs — `rustfmt+clippy`, `build+test` (§14.1 vectors + §14.2 reduced proptest + §14.7 zero-alloc + `bench --no-run` + the §14.6 `!Sync` trybuild compile-fail + dir-lock/`Send` tests), `MSRV 1.85`, `Miri (codec subset)`, **`fuzz smoke (F1/F2/F3/F4)` (blocking — a crash reds the PR)**, `§14.4d dir-fsync presence (strace)` — plus the paths-filtered per-PR gates (`m8-macos.yml` H4 Half A, `lazyfs.yml`, `m8.yml`); the **Nightly (scheduled)** row names each workflow with its staggered cron (`bench.yml` 03:17 / `fuzz.yml` 04:17 / `m8-dmflakey.yml` 04:23 / `soak.yml` 05:17, all CONTINGENT + dispatch), and honestly flags that full-iteration §14.2/§14.3 and the §14.9 differential are **not yet** automated as nightly (covered at PR granularity by the reduced proptest + M6 `model_oracle`). Added the **FS-matrix honesty note** (per plan Slice 9): the FS matrix is meaningful only for the durability/metadata-fault gates (LazyFS/H3/§14.4d); the byte-level codec/CRC/recovery-classifier logic the §14.5 fuzz + §14.6 Miri lanes exercise is **filesystem-independent**, so those lanes are **not** multiplied across the matrix — over-claiming an "all-FS fuzz matrix" would be dishonest. **Honest close-out:** no long-run gate is marked green — the §14.13 fuzz row (N-CPU-hours) and the multi-hour soak stay OPEN-pending a dedicated runner; Miri clean, `!Sync`, and zero-alloc rows are DONE. `cargo fmt --check` / `clippy --all-targets -D warnings` / `cargo test` unaffected (docs-only). **M9 remaining: only the F1–F4 N-CPU-hour + multi-hour-soak release-gate observations on owner/dedicated hardware — no more in-session build work.**
 - **LATEST (2026-06-29): M9 — soak / endurance LANDED (§14.10), short run green here; the multi-hour gate stays CONTINGENT.** New `tests/soak.rs` (`#[ignore]`, env-driven `WAL_SOAK_SECONDS` default 3 / `WAL_SOAK_SEED` / optional `WAL_SOAK_EVIDENCE`): drives a **single long-lived `Wal`** through a weighted randomized loop — append (boundary-biased 0/1/8/`max_record_size`/random) / commit (timed into an `hdrhistogram`) / `checkpoint(durable)` / process-crash-recover (drop+reopen) — over a 4 KiB-segment/256 B-record config so rolls/splits/checkpoints fire constantly. After **every** recover it re-checks the §14.3 refinement envelope against an independent in-memory oracle: **D1/D3** (recovered `durable_lsn` ≥ committed watermark), **D8** (oldest ≤ authorized `up_to`+1, monotone), **D2/D6** (dense byte-identical replay `oldest..=durable` via `reader_from(0)`). Four resource monitors with bounded-growth gates: **fd** (`/proc/self/fd`, `peak ≤ baseline+32`), **disk** (a deterministic **per-checkpoint floor** — the soak always checkpoints to `durable_lsn`, so every sealed segment is superseded and exactly the active segment must remain right after `checkpoint(durable)`; a reclaim-N−1-of-N leak reds on cycle 1, not after 16 segments accrue — backstopped by a `peak ≤ 16×segment_size` runaway ceiling), **RSS** (`/proc/self/statm`×pagesize, `≤ baseline+64 MiB`; §8.5 — recovery materializes no payloads), **commit p999 ≤ 2 s**. Deterministic seeded LCG (no RNG dep) ⇒ reproduces from the seed; oracle self-prunes below `max_ckpt`/`oldest` so it never trips its own RSS watch. **Short run green here: `WAL_SOAK_SECONDS=4` ⇒ ~29 k ops / ~6.5 k commits / ~2 k checkpoints / ~2 k recoveries, fd 6→6, disk floor==1 every checkpoint, RSS +0.5 MiB, p999 sub-ms–7 ms.** **Falsifiability shown (honest form, per review)**: injecting a per-cycle leak (`deletable_prefix_len` reclaims N−1 of N) reds the floor on cycle 1 (`found 2 *.wal files`), then reverted — the real leak class, not a sub-working-set threshold. Wrapper `scripts/m9/soak.sh` (release build, scrapes the one-line JSON summary, re-emits a §5 evidence ledger via `scripts/m8/evidence.sh`, loud SHORT-vs-gate framing) + `.github/workflows/soak.yml` (schedule 05:17 + dispatch, **NOT per-PR**, contingent banner, uploads the evidence artifact). **No `src/` change** (public API only). **Honest framing (same stopgap as LazyFS/bench):** a SHORT run proves the driver/monitors/oracle work but is **not** the gate — the §14.13 soak is a **multi-hour** run on a dedicated runner with zero regression/violation. `cargo test --test soak` compiles, `clippy --test soak`, `shellcheck scripts/m9/soak.sh`, `actionlint soak.yml` clean. **Remaining M9:** CI-matrix tidy-up (§14.11); the F1–F4 N-CPU-hour + multi-hour-soak release-gate observation on a dedicated runner.
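The soak entry above relies on a deterministic seeded LCG so that a failing run can be replayed from its seed alone. A minimal sketch of that idea follows. It assumes the well-known 64-bit multiplier and increment from Knuth's MMIX generator, and the operation weights (70/20/5/5) are hypothetical. `Lcg`, `Op`, `pick_op`, and `schedule` are illustrative names, not taken from `tests/soak.rs`.

```rust
/// Tiny linear congruential generator with no external RNG dependency.
/// The constants are Knuth's MMIX parameters; that choice is an assumption here.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    /// Advance the state and return the high 32 bits, which are the
    /// better-distributed bits of a power-of-two-modulus LCG.
    pub fn next_u32(&mut self) -> u32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 32) as u32
    }
}

/// The four operation kinds named in the soak description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Append,
    Commit,
    Checkpoint,
    CrashRecover,
}

/// Weighted pick out of 100; the weights are hypothetical.
pub fn pick_op(rng: &mut Lcg) -> Op {
    match rng.next_u32() % 100 {
        0..=69 => Op::Append,
        70..=89 => Op::Commit,
        90..=94 => Op::Checkpoint,
        _ => Op::CrashRecover,
    }
}

/// Build the first `n` operations for a seed; the same seed yields the same schedule.
pub fn schedule(seed: u64, n: usize) -> Vec<Op> {
    let mut rng = Lcg::new(seed);
    (0..n).map(|_| pick_op(&mut rng)).collect()
}
```

Because the schedule is a pure function of the seed, logging the seed is enough to reproduce the exact operation sequence that preceded a violation.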