Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,7 @@ The entire value of this component is **correct behavior under crashes and fault

## Project status (keep this updated)

- **LATEST (2026-07-01): M9 — CI-matrix tidy-up LANDED (§14.11); this is the LAST M9 build slice. M9 is now feature-complete, with only the standing owner/dedicated-runner *observations* open.** **Docs-only reconciliation** (no `src/`, no workflow behavior change — the 8 workflows already implemented the intended matrix; the spec had simply not caught up). Rewrote §14.11 into a **faithful index** of what actually runs: the **Per-PR (blocking)** row now enumerates the real `ci.yml` jobs — `rustfmt+clippy`, `build+test` (§14.1 vectors + §14.2 reduced proptest + §14.7 zero-alloc + `bench --no-run` + the §14.6 `!Sync` trybuild compile-fail + dir-lock/`Send` tests), `MSRV 1.85`, `Miri (codec subset)`, **`fuzz smoke (F1/F2/F3/F4)` (blocking — a crash reds the PR)**, `§14.4d dir-fsync presence (strace)` — plus the paths-filtered per-PR gates (`m8-macos.yml` H4 Half A, `lazyfs.yml`, `m8.yml`); the **Nightly (scheduled)** row names each workflow with its staggered cron (`bench.yml` 03:17 / `fuzz.yml` 04:17 / `m8-dmflakey.yml` 04:23 / `soak.yml` 05:17, all CONTINGENT + dispatch), and honestly flags that full-iteration §14.2/§14.3 and the §14.9 differential are **not yet** automated as nightly (covered at PR granularity by the reduced proptest + M6 `model_oracle`). Added the **FS-matrix honesty note** (per plan Slice 9): the FS matrix is meaningful only for the durability/metadata-fault gates (LazyFS/H3/§14.4d); the byte-level codec/CRC/recovery-classifier logic the §14.5 fuzz + §14.6 Miri lanes exercise is **filesystem-independent**, so those lanes are **not** multiplied across the matrix — over-claiming an "all-FS fuzz matrix" would be dishonest. **Honest close-out:** no long-run gate is marked green — the §14.13 fuzz row (N-CPU-hours) and the multi-hour soak stay OPEN-pending a dedicated runner; Miri clean, `!Sync`, and zero-alloc rows are DONE. `cargo fmt --check` / `clippy --all-targets -D warnings` / `cargo test` unaffected (docs-only). **M9 remaining: only the F1–F4 N-CPU-hour + multi-hour-soak release-gate observations on owner/dedicated hardware — no more in-session build work.**
- **LATEST (2026-06-29): M9 — soak / endurance LANDED (§14.10), short run green here; the multi-hour gate stays CONTINGENT.** New `tests/soak.rs` (`#[ignore]`, env-driven `WAL_SOAK_SECONDS` default 3 / `WAL_SOAK_SEED` / optional `WAL_SOAK_EVIDENCE`): drives a **single long-lived `Wal`** through a weighted randomized loop — append (boundary-biased 0/1/8/`max_record_size`/random) / commit (timed into an `hdrhistogram`) / `checkpoint(durable)` / process-crash-recover (drop+reopen) — over a 4 KiB-segment/256 B-record config so rolls/splits/checkpoints fire constantly. After **every** recover it re-checks the §14.3 refinement envelope against an independent in-memory oracle: **D1/D3** (recovered `durable_lsn` ≥ committed watermark), **D8** (oldest ≤ authorized `up_to`+1, monotone), **D2/D6** (dense byte-identical replay `oldest..=durable` via `reader_from(0)`). Four resource monitors with bounded-growth gates: **fd** (`/proc/self/fd`, `peak ≤ baseline+32`), **disk** (a deterministic **per-checkpoint floor** — the soak always checkpoints to `durable_lsn`, so every sealed segment is superseded and exactly the active segment must remain right after `checkpoint(durable)`; a reclaim-N−1-of-N leak reds on cycle 1, not after 16 segments accrue — backstopped by a `peak ≤ 16×segment_size` runaway ceiling), **RSS** (`/proc/self/statm`×pagesize, `≤ baseline+64 MiB`; §8.5 — recovery materializes no payloads), **commit p999 ≤ 2 s**. Deterministic seeded LCG (no RNG dep) ⇒ reproduces from the seed; oracle self-prunes below `max_ckpt`/`oldest` so it never trips its own RSS watch. **Short run green here: `WAL_SOAK_SECONDS=4` ⇒ ~29 k ops / ~6.5 k commits / ~2 k checkpoints / ~2 k recoveries, fd 6→6, disk floor==1 every checkpoint, RSS +0.5 MiB, p999 sub-ms–7 ms.** **Falsifiability shown (honest form, per review)**: injecting a per-cycle leak (`deletable_prefix_len` reclaims N−1 of N) reds the floor on cycle 1 (`found 2 *.wal files`), then reverted — the real leak class, not a sub-working-set threshold. Wrapper `scripts/m9/soak.sh` (release build, scrapes the one-line JSON summary, re-emits a §5 evidence ledger via `scripts/m8/evidence.sh`, loud SHORT-vs-gate framing) + `.github/workflows/soak.yml` (schedule 05:17 + dispatch, **NOT per-PR**, contingent banner, uploads the evidence artifact). **No `src/` change** (public API only). **Honest framing (same stopgap as LazyFS/bench):** a SHORT run proves the driver/monitors/oracle work but is **not** the gate — the §14.13 soak is a **multi-hour** run on a dedicated runner with zero regression/violation. `cargo test --test soak` compiles, `clippy --test soak`, `shellcheck scripts/m9/soak.sh`, `actionlint soak.yml` clean. **Remaining M9:** CI-matrix tidy-up (§14.11); the F1–F4 N-CPU-hour + multi-hour-soak release-gate observation on a dedicated runner.
- **LATEST (2026-06-29): M9 — loom item REMOVED as not-applicable (PR #30 closed unmerged); spec corrected, not silently deleted.** A loom publish-barrier harness was built (PR #30) but **closed without merging**: the test imported only `loom::sync::*` (no `open_wal` symbol) and hand-built a writer/reader cursor *inside the test*, so it was green independently of the crate — it validated a textbook Release/Acquire fact, not this code. The deeper reason it can't be fixed: the writer is **`!Sync` by construction** (`PhantomData<Cell<()>>`, §6.2, proven statically by the #29 trybuild test), so there is **no intra-WAL shared-memory concurrency for loom to model**, and a static `!Sync` proof is *stronger* than an empirical model check. The only Release/Acquire barrier is the **integrator's** publish cursor (out of scope, §1/§15.7); the WAL's actual obligation there (`on_durable` fires strictly after durability, monotonically, writer thread) is covered by the **§15.8 `DurabilityObserver` contract tests**. **Correction (docs-only):** §14.6 loom bullet → a normative *not-applicable* note (conditioned on the writer staying `!Sync`, so a future `Sync` redesign re-opens it); §14.6 concurrency cross-ref + the tooling list de-loom'd; v6 changelog entry added; `loom` dropped from the dev-tools list. **No `loom` dev-dep / `cfg(loom)` / `loom.yml` reaches `main`** (they lived only on the closed PR). Same discipline as the §14.4d negative result — the characterized "why not" is the deliverable. `cargo fmt --check` / `clippy --all-targets -D warnings` / `cargo test` unaffected (docs-only). **Remaining M9:** soak; CI-matrix tidy-up; the F1–F4 N-CPU-hour release-gate observation.
- **LATEST (2026-06-29): M9 — `!Sync` compile-fail (trybuild) + Send/dir-lock LANDED (§14.6), green here.** New `trybuild` dev-dep + `tests/ui.rs` runner + `tests/ui/wal_not_sync.rs` (+ pinned `tests/ui/wal_not_sync.stderr`): asserts `Wal<NullObserver>: Sync` does **not** compile — the `PhantomData<Cell<()>>` marker (§6.2) makes the handle `!Sync`, so concurrent writers are a compile error (`Cell<()>` cannot be shared between threads safely). The **`Send`** half is asserted positively by a new `wal::tests::handle_is_send`; the runtime **directory lock** was **already** covered by `wal::tests::second_open_is_locked` (second in-process `Wal::open` ⇒ `WalError::Locked` via the exclusive `flock` on `dir/LOCK`) — not duplicated. **trybuild `.stderr` fragility handled honestly:** it's toolchain-sensitive, so it runs **only in the per-PR `test` job (stable)** — the MSRV job is `cargo check`, which builds the runner but does not execute it, so the `.stderr` is compared against exactly one toolchain; regenerate with `TRYBUILD=overwrite cargo test --test ui` on a rustc bump (documented in `Cargo.toml`, `tests/ui.rs`, §14.6). Generated/verified on rustc 1.94.1; the diagnostic wording ("cannot be shared between threads safely") is stable. `cargo test` (22 ok-lines incl. `ui`/`handle_is_send`/`second_open_is_locked`), `clippy --all-targets -D warnings`, `cargo fmt --check`, **MSRV 1.85** (trybuild v1.0.117 builds) green. Docs: §14.6 two bullets + a new §14.13 single-writer DoD row. **Remaining M9:** loom publish-barrier; soak; CI-matrix tidy-up; the F1–F4 N-CPU-hour release-gate observation.
Expand Down
6 changes: 3 additions & 3 deletions docs/wal_design_v6.md
Original file line number Diff line number Diff line change
Expand Up @@ -623,11 +623,11 @@ Multi-hour randomized workload with periodic injected crashes+recoveries and che
*(**M9 — IMPLEMENTED** as `tests/soak.rs` (`#[ignore]`, env-driven `WAL_SOAK_SECONDS`/`WAL_SOAK_SEED`), driven by `scripts/m9/soak.sh` and run nightly/dispatch in `soak.yml`. It drives a **single long-lived `Wal`** through a weighted randomized loop — append (boundary-biased sizes 0/1/8/`max_record_size`/random), commit (timed into an `hdrhistogram`), `checkpoint(durable)`, and process-crash-recover (drop + reopen) — over a 4 KiB-segment/256 B-record config so rolls/splits/checkpoints fire constantly. After **every** recover it re-checks the §14.3 refinement envelope (D1/D3 durable ≥ committed watermark, D8 oldest ≤ authorized `up_to`+1 and monotone, D2/D6 dense byte-identical replay `oldest..=durable`) against an independent in-memory oracle, so a *correctness* regression under sustained load reds the run too — not just a leak. The four monitors are sampled every 128 ops with bounded-growth gates: **fd** (`/proc/self/fd`, `peak ≤ baseline+32`), **disk** (a deterministic **per-checkpoint floor** — because the soak always checkpoints to exactly `durable_lsn`, every sealed segment is fully superseded, so right after `checkpoint(durable)` exactly the active segment must remain (§9); a checkpoint that reclaims N−1 of N sealed segments reds on the **first** bad cycle, not after 16 accumulate — backstopped by a `peak ≤ 16×segment_size` ceiling for gross runaway between checkpoints), **RSS** (`/proc/self/statm`, `≤ baseline+64 MiB`; recovery materializes no payloads, §8.5), and **commit-latency** (p999 ≤ 2 s ceiling). Deterministic (seeded LCG, no RNG dep) so a failure reproduces from `WAL_SOAK_SEED`. **Honest framing (same stopgap as LazyFS/dm-flakey/bench):** a SHORT run (the per-dispatch default 300 s, and the in-session smoke) proves the driver/monitors/oracle work but is **NOT** the gate — the §14.13 soak gate is a **multi-hour** run with zero regression/violation; a short green run is CONTINGENT, a failure is a real bug. Falsifiability shown (honest form): injecting a per-cycle leak (`deletable_prefix_len` reclaims N−1 of N) reds the **floor** assertion on cycle 1 (`found 2 *.wal files`) — the real leak class, not a sub-working-set threshold; the oracle's refinement checks are the same ones the M6 harness demonstrated falsifiable. No `src/` change — the driver uses only the public API.)*

### 14.11 CI matrix
- **Per-PR (fast):** §14.1, §14.2 (reduced), §14.6 compile-fail + Miri subset, §14.7 alloc assertion (enforced, `ci.yml` `cargo test`) + `cargo bench --no-run` (benches must compile, can't bitrot), **§14.8 H4 Half A** *(M8: `m8-macos.yml` on `macos-latest`, paths-filtered to the durable-path sources + the test; per-PR + push-to-main + manual — a macOS-only `F_FULLFSYNC`-routing regression is invisible to the Linux PR CI because the `cfg(macos)` path does not compile there; `dtruss` Half B stays owner-run per #19)*.
- **Nightly:** full §14.2/§14.3 (high iteration), §14.4 LazyFS suite, §14.5 fuzz (time-boxed), §14.7 benchmarks + gates *(M7: `bench.yml`, schedule + manual; the gate runs **informational** on hosted runners until a controlled/pinned-governor runner makes the §14.7 thresholds enforceable — same stopgap as the LazyFS gate)*, §14.9 differential, **§14.8 H3-physical + §14.4d** *(M8: `m8-dmflakey.yml`, push-to-main + schedule + manual, not per-PR; hosted ubuntu VMs reach `dm-flakey`, so these are real CI gates there — ext4 hard with a source-confirmed block-layer EIO (#16) and a `drop_writes` positive control (#17), xfs/btrfs informational, **best-effort + loud skip** if a runner lacks dm-flakey)*. **H4 Half A** is per-PR not nightly — see Per-PR row addendum below.
- **Per-PR (fast, blocking).** The always-on `ci.yml` jobs (`push:main` + `pull_request` + dispatch): **`rustfmt + clippy`** (`-D warnings`, all targets); **`build + test`** = §14.1 codec/CRC vectors, §14.2 (reduced proptest) round-trip/property suites, §14.7 **zero-alloc assertion** (enforced) + `cargo bench --no-run` (benches must compile, can't bitrot), and the §14.6 **`!Sync` compile-fail** (`tests/ui.rs` trybuild, stable-only) + **directory-lock / `Send`** runtime tests — all under `cargo test`; **`MSRV (1.85)`** `cargo check --all-targets --locked`; **`Miri (codec subset)`** (§14.6, `record`/`crc`/`lsn`/`config` under `-Zmiri-strict-provenance` — the whole Miri-executable surface, since the file-backed paths use foreign syscalls Miri can't run); **`fuzz smoke (F1/F2/F3/F4)`** (§14.5, short bounded `cargo fuzz run` per target — **BLOCKING: a reproducible crash is a real D1–D11 bug and reds the PR**, distinct from the nightly time-boxed lane and never gating an H1 dispatch); **`§14.4d dir-fsync presence (strace)`** (M8 Tier-1, FS-independent syscall-presence guard). Plus the **paths-filtered** per-PR durability gates that only fire when their sources change: **§14.8 H4 Half A** *(`m8-macos.yml` on `macos-latest` — a macOS-only `F_FULLFSYNC`-routing regression is invisible to Linux PR CI because the `cfg(macos)` path does not compile there; `dtruss` Half B stays owner-run per #19)*, the **§14.4b LazyFS** gate *(`lazyfs.yml`, informational)*, and the **§12 fsync-fault poison** gate *(`m8.yml`)*.
- **Nightly (scheduled, staggered off the `:00` rush + `workflow_dispatch`).** **§14.7 benchmarks + gates** *(`bench.yml`, 03:17 UTC; **informational** on hosted runners until a controlled/pinned-governor runner makes the §14.7 thresholds enforceable — same stopgap as the LazyFS gate)*; **§14.5 fuzz (time-boxed F1–F4)** *(`fuzz.yml`, 04:17 UTC; CONTINGENT, NOT the N-CPU-hour §14.13 gate — a crash still reds it)*; **§14.8 H3-physical + §14.4d** *(`m8-dmflakey.yml`, 04:23 UTC + push-to-main; hosted ubuntu VMs reach `dm-flakey`, so these are real gates there — ext4 hard with a source-confirmed block-layer EIO (#16) and a `drop_writes` positive control (#17), xfs/btrfs informational, **best-effort + loud skip** if a runner lacks dm-flakey)*; **§14.10 soak** *(`soak.yml`, 05:17 UTC; a short hosted slice is a CONTINGENT smoke, not the multi-hour gate)*. *(Still to be wired as nightly, not yet automated: full-iteration §14.2/§14.3 and the §14.9 differential — currently the reduced per-PR proptest + the M6 `model_oracle` suite cover these at PR granularity.)*
- **Pre-release / manual:** §14.8 H1 power-pull on target hardware, §14.10 soak *(M9: `soak.yml`, schedule + manual, **contingent** until a multi-hour run on a dedicated runner — a hosted nightly slice is a smoke, not the gate; same stopgap as the LazyFS/bench informational lanes; each run uploads its §5 soak-evidence artifact)*. *(The nightly §14.8 dm-flakey/macOS gates above also accept `workflow_dispatch`; a manual run posts its §5 evidence to the tracking issue as the human sign-off, while the nightly cron stays artifact-only and surfaces regressions as a red build.)*
- **OS matrix:** Linux (primary — sole platform for the §14.8 hardware-durability gate), macOS (dev/correctness — exercises `F_FULLFSYNC`; unit/property/fuzz only, not §14.8). Windows is out of scope for v1 (§8.3).
- **FS matrix (Linux):** ext4, xfs, btrfs (CoW), tmpfs (logic only — never durability claims).
- **FS matrix (Linux):** ext4, xfs, btrfs (CoW), tmpfs (logic only — never durability claims). **Scope, honestly:** the FS matrix is meaningful *only* for the durability/metadata-fault gates whose outcome depends on filesystem journaling and cache semantics — §14.4b LazyFS, §14.8 H3-physical, and the §14.4d negative control (which reproduces behaviorally only on a journal-less FS — see §14.4d). The **byte-level logic** — the record codec, CRC, the recovery classifier/bounded scan, and everything the §14.5 fuzz targets and the §14.6 Miri subset exercise — is **filesystem-independent** (pure in-memory parsing over arbitrary bytes, or plain `pwrite`/`fdatasync` with no FS-specific assumption), so those lanes run once on the runner's default FS and are **not** multiplied across the matrix. Over-claiming an "all-FS fuzz matrix" would be dishonest; the fuzz/codec guarantees hold by construction regardless of FS.

### 14.12 Traceability matrix (invariant → tests)

Expand Down
Loading