FE-829: Build-architect for spec-derived cook plans by kostandinang · Pull Request #185 · hashintel/brunch

kostandinang · 2026-06-09T11:22:36Z

Summary

brunch plan evolves from a sequencer into a build-architect: it now emits fixture-quality, cook-executable plan.yaml and gates every emission on a producer-agnostic executability contract. FE-800 emitted a faithful but un-buildable projection of a spec's requirement graph (conceptual deps, no decomposition, no join slices, no epic seam), so multi-slice epics cooked into "green checks, no assembled artifact." This PR closes that gap — the target shape is the three hand-authored reference fixtures.

What Changed

Executability contract: a self-contained PlanContract. checkPlan is one reusable predicate over hand-authored fixtures, emitter output, and any future producer, split into base (authored/reference) and strict emitted (brunch plan output) profiles, alongside repairPlan. It adds a shared plan-graph Kahn helper and a Toolchain descriptor that derives verification targets, and it synthesizes the multi-slice-epic integration seam, closing FE-800's integration-blind gap. (D167-K, I129-K)
The architect AUTHORS the slice set — architectPlan, a single schema-constrained LLM call, authors decomposed, file-disjoint slices (scaffold + per-behaviour + a join slice owning shared files) with writes and derivedFrom. materializeArchitectedPlan deterministically normalizes them + emits a requirement-provenance coverage sidecar; repairPlan + checkPlan gate the result, with a deterministic projection fallback on authoring throw / parse-fail / uncovered requirement. (I133-K)
File-ownership contract — Slice.writes?: string[] + single-writer-per-file file-write-conflict (a design-class warning, never auto-repaired) + the D160-K coordination-file-layout-namespace amendment. (I132-K)
Toolchain-agnostic cook harness — ToolchainTestRunner + toolchain-driven task builders + de-hardcoded test-writer.md, all resolved from plan.profile (resolveToolchain, bun default). (I130-K)
Eval harness — evaluatePlanShape (plan-eval.ts), a deterministic outer-loop acceptance oracle: a narrow verdict gate (emitted-contract errors / file-write-conflict / missing-writes → reject; never a score threshold the model can game) plus a graded structural-feature metric vector scored against the abstract fixture-design principles (no id/path/count overfit). (I134-K)
Cleanup — retired the now-dormant slice-3 planner: deleted plan-llm-planning.ts, relocated the surviving PlanningEnrichment type into plan-reconciliation.ts, consolidated the duplicate RunModel onto plan-architect.ts. (I131-K retired)
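
The Eval harness item above separates a narrow verdict gate from a graded metric vector. A minimal sketch of that separation follows; the `Finding` and `ShapeReport` shapes, the `verdictGate` name, and the treatment of `missing-writes` as a finding code are assumptions for illustration, since the real `evaluatePlanShape` signature in plan-eval.ts is not shown in this description.

```typescript
// Hypothetical shapes for illustration; not the types in plan-eval.ts.
type Finding = { code: string; severity: "error" | "warning" };
type Verdict = "accept" | "reject";

interface ShapeReport {
  verdict: Verdict;
  reasons: string[];
  metrics: Record<string, number>; // graded features, reported but never gating
}

// Warning codes that reject even though they are not contract errors (assumed names).
const REJECTING_WARNINGS = new Set<string>(["file-write-conflict", "missing-writes"]);

function verdictGate(findings: Finding[], metrics: Record<string, number>): ShapeReport {
  const reasons = findings
    .filter((f) => f.severity === "error" || REJECTING_WARNINGS.has(f.code))
    .map((f) => f.code);
  // The verdict depends only on findings, so metric values cannot buy acceptance.
  return { verdict: reasons.length === 0 ? "accept" : "reject", reasons, metrics };
}
```

In this sketch a plan with weak metrics but no rejecting findings is still accepted, and a plan with perfect metrics but a write conflict is still rejected, which is the "no score threshold" property the item describes.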

Consequence

No host introspection, no test content (per D160-K amended): the architect reasons only from projected spec truth, toolchain conventions, and inlined exemplars; verification targets are synthesized deterministically and the cook agent authors the concrete tests at run time (A98). The three reference fixtures were refreshed (writes on every slice + the previously-missing integration seam on two of their epics) to score overall === 1. SPEC records A100-K (partially-validated), D160-K (amended), D167-K, I129-K, I130-K, I131-K (retired), I132-K, I133-K, I134-K, and the Future Direction §Cook plan generation arc.

Verification

npm run verify green (check + full test suite + build); per-slice TDD, each commit independently green. Oracles: plan-contract / plan-architect / plan-materialize / plan-emitter / plan-eval / plan-planning-context suites; the three reference fixtures self-test to accept / overall === 1; real brunch plan 31 architect output scores accept / 1.0; opt-in real-LLM eval smoke (PLANNING_REAL_LLM=1 + ANTHROPIC_API_KEY) runs the production architect end-to-end and asserts the emitted plan passes the gate.

kostandinang · 2026-06-09T11:23:00Z

No tech stack baked into prompts, tests, or the emitter: code-writer/test-writer prompts and the test-runner read language/framework/file conventions from a shared ProjectProfile/Toolchain descriptor (the same one the emitter uses to derive verification targets). Adds it as slice 2 of plan-build-architect — the PLAN-named, previously-unowned bun→host adapter — coordinated with FE-813. Bun is the first profile. Also fixes the branch-name reference to ka/fe-829-build-architect (PR #185). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

cursor · 2026-06-09T14:56:57Z

PR Summary

Medium Risk
Large orchestrator/plan emission refactor with LLM-authored decomposition and fallback paths; cook test execution and CLI path resolution behavior change, but contract tests and reference fixtures gate regressions.

Overview
brunch plan is now a build-architect, not a thin sequencer: it replaces planExecutionOrdering / plan-llm-planning.ts with architectPlan (one structured LLM call that authors scaffold, per-behaviour, and join slices with writes and derivedFrom), then materializeArchitectedPlan, repairPlan, and strict checkPlan (emitted). On throw, parse failure, uncovered requirements, or an empty authored slice set, emission falls back to deterministic projection + repair with an architect-failed-fallback-to-projection warning—no second LLM call.
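
The fallback rules in the overview can be sketched as a small control-flow function. This is a simplified, synchronous illustration under stated assumptions: `emitPlanSketch`, `Draft`, and the injected `architect` / `project` callbacks are hypothetical stand-ins, and the materialize, repair, and strict check steps are omitted.

```typescript
// All names below are illustrative stand-ins, not the real emitter API.
type Slice = { id: string; derivedFrom: string[] };
type Draft = { slices: Slice[]; nonBuildableRequirementIds: string[] };
type Emitted = { slices: Slice[]; warnings: string[] };

function emitPlanSketch(
  requirementIds: string[],
  architect: () => Draft, // stands in for the single LLM call; may throw
  project: (ids: string[]) => Slice[], // deterministic projection fallback
): Emitted {
  const fallback = (): Emitted => ({
    slices: project(requirementIds),
    warnings: ["architect-failed-fallback-to-projection"],
  });

  let draft: Draft;
  try {
    draft = architect();
  } catch {
    return fallback(); // throw or parse failure: no second LLM call
  }

  // Zero authored slices for a non-empty requirement set is an authoring failure.
  if (draft.slices.length === 0 && requirementIds.length > 0) return fallback();

  // Every requirement must be covered by a slice or explicitly marked non-buildable.
  const covered = new Set<string>(draft.nonBuildableRequirementIds);
  for (const slice of draft.slices) {
    for (const id of slice.derivedFrom) covered.add(id);
  }
  if (requirementIds.some((id) => !covered.has(id))) return fallback();

  return { slices: draft.slices, warnings: [] };
}
```

Note that a genuinely empty requirement set with an empty draft passes through without a warning, matching the "empty spec still emits empty" behavior described in a later commit.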

New executability plumbing: plan-contract.ts (checkPlan / repairPlan, base vs emitted profiles, integration-seam synthesis), plan-graph.ts (shared Kahn cycle-break), project-profile Toolchain (derived verification targets), and evaluatePlanShape in plan-eval.ts (narrow accept/reject gate plus graded structural metrics). Slice.writes enforces single-writer-per-file via file-write-conflict warnings (never auto-repaired).
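
To illustrate the base vs emitted profile split, here is a minimal sketch of a seam check whose severity depends on the profile. The `Epic` shape, the `hasIntegrationSeam` flag, and the `missing-integration-seam` code are assumptions made for the example; the real finding codes in plan-contract.ts are not listed in this summary.

```typescript
// Hypothetical types; the real Plan/Epic shapes are not shown in this PR text.
type Profile = "base" | "emitted";
type Epic = { id: string; sliceIds: string[]; hasIntegrationSeam: boolean };
type SeamFinding = { code: string; severity: "error" | "warning"; epicId: string };

function checkIntegrationSeams(epics: Epic[], profile: Profile): SeamFinding[] {
  const findings: SeamFinding[] = [];
  for (const epic of epics) {
    // Only multi-slice epics need a seam that proves the slices assemble.
    if (epic.sliceIds.length > 1 && !epic.hasIntegrationSeam) {
      findings.push({
        code: "missing-integration-seam", // assumed code name
        // Authored fixtures stay readable under base; emitter output must be strict.
        severity: profile === "emitted" ? "error" : "warning",
        epicId: epic.id,
      });
    }
  }
  return findings;
}
```

The same predicate serves both producers; only the severity changes, so hand-authored fixtures can pass unmodified while emitted plans are held to the stricter bar.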

Cook harness reads plan.profile: ToolchainTestRunner, toolchain-driven sliceTestTask / epicVerifyTask, and test-writer.md no longer hardcode Bun. brunch cook resolves relative dir / --out against BRUNCH_LAUNCH_CWD.

The three reference plan.yaml fixtures were refreshed (writes on every slice; integration seams on layered-todo core and resilient-pipeline pipeline). SPEC / PLAN document D167-K, amended D160-K, and invariants I129–I134 (I131-K retired). Opt-in PLANNING_REAL_LLM=1 smoke asserts real architect output passes evaluatePlanShape.

Reviewed by Cursor Bugbot for commit 7537bc3.

…ct spec
Add the plan-build-architect frontier (FE-829, stacked on FE-827): evolve `brunch plan` from a sequencer into a build-architect emitting fixture-quality, cook-executable plans.
SPEC: A100-K (contract-completeness assumption), D167-K (emitter guarantees cook-executability via a producer-agnostic PlanContract + deterministic repair), I129-K (every emitted plan satisfies the executability contract), and Future Direction §Cook plan generation (build-architect arc + deferred D160-K amendment).
PLAN: frontier added to Sequencing (Active #8) and Frontier Definitions with the four-slice breakdown; resolves FE-800's integration-blind follow-on.
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

… descriptor
Slice 1 of plan-build-architect. brunch plan now gates its output on a producer-agnostic executability contract instead of "always repair, never check".
- project-profile.ts: Toolchain descriptor (bun profile) derives verification targets (sliceTarget/epicTarget) so test paths are no longer hardcoded.
- plan-graph.ts: shared breakDependencyCycles (one Kahn lex-smallest policy) reused by reconcilePlan and the contract so the two cannot drift.
- plan-contract.ts: total/pure checkPlan + deterministic repairPlan. Resolves the seam-invariant vs read-only-fixtures conflict with a base/emitted profile split: a multi-slice epic missing its integration seam is a warning under base (authored fixtures pass check unmodified) and an error under emitted (brunch plan output); repairPlan always synthesizes the seam. Mechanical-class repair (drop self/dangling deps, cycle-break, mint per-slice target, synthesize epic seam) is idempotent; design-class (uncovered requirement) is surfaced, not invented.
- plan-emitter.ts: repairs after reconcile, surfaces synthesized-integration-seam as a typed EmitterWarning; every emitted plan passes the strict emitted profile.
- plan-reconciliation.ts: routes target-mint through the toolchain and reuses the shared Kahn helper (behavior preserved).
Closes FE-800's integration-blind / "green checks, no artifact" gap for the static half. No LLM, no D160-K change.
SPEC: I129-K (profiles), A100-K (partially-validated), D167-K (profile-split refinement).
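
One way a deterministic "Kahn lex-smallest" cycle break could work is sketched below. This is not the `breakDependencyCycles` implementation from plan-graph.ts, whose exact policy is not shown here. The sketch assumes unique ids and that dangling dependencies were already removed, and it picks this tie-break: when no node is ready, force the lexicographically smallest pending node and drop its unsatisfied edges.

```typescript
// Illustrative sketch only; types and policy details are assumptions.
type GraphNode = { id: string; depends_on: string[] };
type DroppedEdge = { from: string; to: string };

function breakCyclesSketch(nodes: GraphNode[]): { nodes: GraphNode[]; dropped: DroppedEdge[] } {
  const deps = new Map<string, string[]>();
  for (const n of nodes) deps.set(n.id, n.depends_on.slice());
  const done = new Set<string>();
  const dropped: DroppedEdge[] = [];

  while (done.size < nodes.length) {
    // Pending ids in lexicographic order make every choice deterministic.
    const pending = nodes.map((n) => n.id).filter((id) => !done.has(id)).sort();
    let next = pending.find((id) => (deps.get(id) ?? []).every((d) => done.has(d)));
    if (next === undefined) {
      // Every pending node waits on another pending node, so a cycle exists.
      next = pending[0];
      const current = deps.get(next) ?? [];
      for (const d of current) {
        if (!done.has(d)) dropped.push({ from: next, to: d });
      }
      deps.set(next, current.filter((d) => done.has(d)));
    }
    done.add(next);
  }

  return {
    nodes: nodes.map((n) => ({ id: n.id, depends_on: deps.get(n.id) ?? [] })),
    dropped,
  };
}
```

Because the output is acyclic, running the function again on its own result drops nothing, which is the idempotence property the commit message attributes to mechanical-class repair.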

…hain)
Make the toolchain spec-derived plan truth, mirroring the Plan.mode precedent (D164-K), instead of hardcoding the bun default everywhere.
- project-profile.ts: add ProfileId union, a brunch profile (TypeScript + vitest, co-located *.test.ts / *.integration.test.ts), a PROFILES registry, and resolveToolchain(profile?) that falls back to bun for an absent/unknown profile (same lenient default loadPlan applies to mode).
- types.ts: Plan.profile?: ProfileId (optional, so no churn to existing Plan literals).
- plan-projection.ts: CompletedSpecSnapshot.profile carried onto the emitted plan so the emitter and brunch cook resolve the same Toolchain.
- plan-reconciliation.ts: preserve profile on the reconciled plan.
- plan-emitter.ts: resolveToolchain(projected.profile) replaces the hardcoded defaultToolchain.
Tests: brunch profile yields co-located targets; absent profile defaults to bun's tests/<id>.test.ts.
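
The lenient profile lookup described in this commit might look roughly like the following. The `ProfileId`, `PROFILES`, and `resolveToolchain` names come from the commit message, but the `Toolchain` shape is reduced to one method here, and the brunch profile's `src/<id>.test.ts` path is an assumed stand-in for the co-located layout, which the commit text does not spell out.

```typescript
// Simplified sketch; the real Toolchain in project-profile.ts has more members.
type ProfileId = "bun" | "brunch";

interface Toolchain {
  id: ProfileId;
  sliceTarget(sliceId: string): string;
}

const PROFILES: Record<ProfileId, Toolchain> = {
  bun: { id: "bun", sliceTarget: (id) => `tests/${id}.test.ts` },
  // Assumed co-located path; the actual brunch layout is not shown in this PR text.
  brunch: { id: "brunch", sliceTarget: (id) => `src/${id}.test.ts` },
};

function resolveToolchain(profile?: string): Toolchain {
  // Lenient default: an absent or unknown profile resolves to bun.
  if (profile !== undefined && Object.prototype.hasOwnProperty.call(PROFILES, profile)) {
    return PROFILES[profile as ProfileId];
  }
  return PROFILES.bun;
}
```

Usage under these assumptions: `resolveToolchain(plan.profile).sliceTarget("auth")` yields `tests/auth.test.ts` when the plan carries no profile.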

The cook execution harness no longer hardcodes bun; it reads the test stack from the Toolchain resolved from plan.profile (I130-K).
- project-profile.ts: Toolchain gains testCommand(target) + testConventions; bun (bun test / bun:test) and brunch (vitest run / vitest) implemented.
- test-runner.ts: BunTestRunner -> ToolchainTestRunner (runs toolchain.testCommand).
- pi-actions.ts: runBunTest -> toolchain-driven runTest; extracted sliceTestTask / epicVerifyTask builders that inject toolchain.testConventions; createPiActions({ toolchain }).
- prompts/test-writer.md: names no framework; conventions come from the task.
- cook-cli.ts: resolve toolchain from plan.profile, pass to runner + actions.
Tests: project-profile.test.ts; ToolchainTestRunner honors an arbitrary toolchain command; pi-actions task builders carry conventions + a guard that the prompt has no hardcoded stack. Full orchestrator suite green.
SPEC: §Cook plan generation marked landed, new invariant I130-K, FE-813 "still unowned" note closed.
PLAN: slice 2 done, slice 3 next.
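
A toolchain-driven runner of the kind this commit describes can be sketched with an injected executor, which is also how a test could check that an arbitrary toolchain command is honored. The `makeTestRunner` factory, the `Exec` type, and the argv-array return type of `testCommand` are assumptions; only the `testCommand` / `testConventions` member names and the `bun test` / `vitest run` commands come from the commit text.

```typescript
// Illustrative sketch; not the ToolchainTestRunner from test-runner.ts.
interface TestToolchain {
  testCommand(target: string): string[]; // argv form is an assumption
  testConventions: string;
}

// Injected process runner so the sketch has no dependency on a real shell.
type Exec = (argv: string[]) => { exitCode: number };

function makeTestRunner(toolchain: TestToolchain, exec: Exec) {
  return {
    // The runner never names a framework; it only runs what the toolchain says.
    run(target: string): boolean {
      return exec(toolchain.testCommand(target)).exitCode === 0;
    },
  };
}

const bunToolchain: TestToolchain = {
  testCommand: (target) => ["bun", "test", target],
  testConventions: "Import test helpers from bun:test.",
};

const vitestToolchain: TestToolchain = {
  testCommand: (target) => ["vitest", "run", target],
  testConventions: "Import test helpers from vitest.",
};
```

Swapping `bunToolchain` for `vitestToolchain` changes the command that runs without touching the runner, which is the decoupling the commit is after.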

…re exemplars (slice 3)
Enriches the planning prompt from spec truth while keeping the LLM stage's output schema unchanged (classify/group/order the existing req-* slices only):
- projectPlanningContext lifts spec relation edges into req-* slice space (verifies consumed as criterion->requirement ownership bridge only; unresolved/self edges dropped; deduped + stable-sorted)
- planning prompt now carries each slice's acceptance criteria, the projected relation hints, and the three reference fixtures inlined as comment-stripped few-shot exemplars (plan-exemplars.ts, embedded constants)
- build-architect framing + no-invent/split/rename guardrail
- emitter threads the snapshot-derived context end-to-end
Deterministic plumbing is unit-tested; model output quality deferred to the eval harness (slice 5) + opt-in real-LLM smoke. SPEC I131-K.
Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

…le (slice 4A)
Deterministic half of slice 4 (oracle-scoped split; 4B LLM authoring deferred):
- Slice.writes?: string[]: repo-relative POSIX paths a slice exclusively mutates (exact paths only, no globs/dirs)
- checkPlan: file-write-conflict finding. A path declared by >=2 slices is a design-class WARNING (never error, never auto-repaired); intra-slice dup paths deduped first so they cannot self-conflict
- repairPlan preserves writes verbatim, never moves ownership or mints a join slice; loadPlan round-trips the field (absent -> undefined)
- a join slice is the sole writer of a shared coordination file that depends_on the slices it joins, NOT a multi-writer exception
D160-K amended to permit a coordination file-layout namespace under strict no-host-introspection / no-test-content guards (cook agent authors tests per A98). SPEC I132-K; A100-K file-disjointness half opened. 4B (emitter/LLM authors writes + requirement decomposition + join synthesis) deferred to a later sub-slice + the slice-5 eval harness.
Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>
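
The single-writer-per-file rule can be sketched as a pass that groups declared paths by owning slice. The `WriteConflict` shape and the `findWriteConflicts` name are assumptions for illustration; the `writes` field, the `file-write-conflict` code, the warning severity, and the dedupe-first rule come from the commit message.

```typescript
// Illustrative sketch; not the checkPlan implementation.
type WritingSlice = { id: string; writes?: string[] };
type WriteConflict = {
  code: "file-write-conflict";
  severity: "warning";
  path: string;
  sliceIds: string[];
};

function findWriteConflicts(slices: WritingSlice[]): WriteConflict[] {
  const writers = new Map<string, string[]>();
  for (const slice of slices) {
    // Dedupe within a slice first so a repeated path cannot conflict with itself.
    const unique = (slice.writes ?? []).filter((p, i, all) => all.indexOf(p) === i);
    for (const path of unique) {
      const owners = writers.get(path) ?? [];
      owners.push(slice.id);
      writers.set(path, owners);
    }
  }
  const conflicts: WriteConflict[] = [];
  writers.forEach((sliceIds, path) => {
    // Two or more declared writers is a design-class warning, never auto-repaired.
    if (sliceIds.length >= 2) {
      conflicts.push({ code: "file-write-conflict", severity: "warning", path, sliceIds });
    }
  });
  return conflicts;
}
```

Under this rule a join slice avoids the warning by being the only slice that lists the shared file in its `writes`, while depending on the slices it joins.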

… 4B)
The emitter mainline now AUTHORS the slice set instead of enriching the 1:1 projected req-* slices:
- plan-architect.ts: architectPlan(), a schema-constrained LLM call emitting authored slices (scaffold + per-behaviour + join) each with writes (file ownership) + derivedFrom (requirement provenance), plus nonBuildableRequirementIds. No host introspection, no test content (D160-K amended; cook agent writes tests per A98). Reuses the exemplars + planning context; exemplars now teach single-writer join ownership.
- plan-materialize.ts: deterministic normalize of the draft -> Plan + coverage sidecar (filters unknown req refs keeping the slice, drops self/dangling deps, breaks cycles via the shared Kahn policy, resolves epic membership from slice.epic_id, appends criteria into definition prose, synthesizes verification targets).
- checkPlan: generalized requirement-provenance coverage (requirementIds/coveredRequirementIds/nonBuildableRequirementIds); legacy 1:1 form retained.
- emitter: architect -> materialize -> repair -> checkPlan(emitted+coverage); deterministic projection fallback (reconcile-empty + repair, no second LLM call) on throw / parse-fail / uncovered requirement. file-write-conflict is surfaced, never silently shipped. EmitPlanResult.planningResult -> architectResult.
The slice-3 planExecutionOrdering enrichment stage is superseded on the mainline (I131-K -> I133-K) and dormant pending retirement. SPEC I133-K; PLAN slices 1-4 done. Deterministic plumbing unit-tested (architect schema/prompt/failure, materializer coverage/provenance/purity, emitter authored happy-path + fallbacks + conflict + toolchain, plan-runner). Decomposition QUALITY deferred to slice 5 (eval harness) + opt-in real-LLM smoke.
Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

… refresh (slice 5) Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

…ce-5 hardening) Derive multiSliceEpicsMissingSeam from the contract findings evaluatePlanShape already computes, instead of re-detecting it with a local INTEGRATION_KIND constant + verification predicate, so eval and checkPlan cannot drift on the seam definition. Also narrow the param to Omit<ContractExpectations,'profile'> so the forced emitted profile is a compile-time fact, not a silent override. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

Delete plan-llm-planning.ts (+ its test) — the dormant slice-3 enrichment stage superseded on the mainline by the authoring architect (slice 4B). Relocate the only load-bearing survivor, the PlanningEnrichment type (reconcile's deterministic-fallback input contract), into plan-reconciliation.ts next to its consumer; consolidate the duplicate RunModel type onto plan-architect.ts (plan-runner now imports it there). The Zod planningEnrichmentSchema and the dead defaultRunModel go with the deleted function. Refresh stale comment references. I131-K retired. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

… gate Env-gated (PLANNING_REAL_LLM=1 + ANTHROPIC_API_KEY; it.skip otherwise) block in plan-emitter.test.ts runs the production architect end-to-end on the brunch-graphs-snapshot.json fixture and asserts evaluatePlanShape(plan).verdict === 'accept'. Restores the opt-in real-LLM coverage lost when plan-llm-planning.test.ts was retired, now homed at the I134-K acceptance gate. Skipped by default; CI/verify need no credentials. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

Cursor Bugbot (medium): an architect draft with no slices that marks every projected requirement non-buildable passes checkPlan under the emitted profile (coverage is vacuously satisfied, no slices/epics to fault) and would ship an empty, cook-executable-looking plan.yaml that does no work. Guard the emitter: zero authored slices for a non-empty requirement universe is an authoring failure -> deterministic projection fallback (which always yields slices). A genuinely empty spec still emits empty. Regression test added. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

Cursor Bugbot (high): materialize hardcoded depends_on: [] on every output epic, discarding architect-authored cross-epic gates (e.g. cli waiting on core), so emitted multi-epic plans could run downstream epics before upstream ones finished. Propagate epic depends_on, cleaned with the same policy as slices: deps on dropped/empty/unknown epics are filtered and cycles are broken via the shared Kahn pass. Regression tests added. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

Cursor Bugbot (medium): greenfield promotion resolved --out with resolve() against the CLI child's cwd, which via bin/brunch is the package root, not the user's project dir — so a relative --out could promote into the wrong tree. Resolve both the positional dir and --out against BRUNCH_LAUNCH_CWD (the same launch cwd runCook uses). Regression test added. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

… positive The preflight 'Todo comments' check flags any line where the PR's ticket id co-occurs with the case-insensitive substring 'todo'. The 'layered-todo' reference fixture collided with the FE-829 attributions in memory/SPEC.md and memory/PLAN.md (5 lines), failing the check. Reword those lines to refer to the fixtures collectively / by their 'core' and 'pipeline' epic names (which uniquely identify them) so no FE-829 line contains 'todo'. Docs only; no behavior change. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>

Cursor Bugbot (medium): materialize filtered architect-authored epic depends_on edges to surviving epic ids silently, while slice-level dangling deps emit dropped-dependency-nonexistent-id — so a mistyped or stale epic gate could vanish and cook could schedule epics in the wrong order with no audit trail. Emit a typed dropped-epic-dependency-nonexistent-id warning (epicId + missingId) for every such drop, wired through the emitter's exhaustive classifier/formatter. Regression tests added. Amp-Thread-ID: https://ampcode.com/threads/T-019eac29-22fc-7298-a25d-e1a940e39d68 Co-authored-by: Amp <amp@ampcode.com>
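
The audited drop described in this fix can be sketched as a filter that records a typed warning for each removed edge. The `cleanEpicDeps` name and the `SketchEpic` shape are assumptions; the warning code and its `epicId` / `missingId` fields come from the commit message.

```typescript
// Illustrative sketch; not the materialize implementation.
type SketchEpic = { id: string; depends_on: string[] };
type DroppedEpicDepWarning = {
  code: "dropped-epic-dependency-nonexistent-id";
  epicId: string;
  missingId: string;
};

function cleanEpicDeps(epics: SketchEpic[]): { epics: SketchEpic[]; warnings: DroppedEpicDepWarning[] } {
  const known = new Set<string>(epics.map((e) => e.id));
  const warnings: DroppedEpicDepWarning[] = [];
  const cleaned = epics.map((epic) => ({
    id: epic.id,
    depends_on: epic.depends_on.filter((dep) => {
      if (known.has(dep)) return true;
      // Record every drop so a mistyped or stale gate leaves an audit trail.
      warnings.push({ code: "dropped-epic-dependency-nonexistent-id", epicId: epic.id, missingId: dep });
      return false;
    }),
  }));
  return { epics: cleaned, warnings };
}
```

With this shape a typo such as `cor` instead of `core` still gets filtered, but the emitter has a structured warning to surface rather than a silent reordering risk.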

Co-authored-by: Kostandin Angjellari <kostandinang@users.noreply.github.com>

Align package-setup definition with its writes list so only barrel-exports touches src/index.ts. checkPlan and the architect schema now error on duplicate slice ids before emission. Co-authored-by: Cursor <cursoragent@cursor.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit a165763.

Architect schema and checkPlan now error on duplicate epic ids. Epic-level Kahn cycle breaks emit cycle-break-dropped-epic-edge with epicId instead of reusing the slice-oriented cycle-break-dropped-edge code. Co-authored-by: Cursor <cursoragent@cursor.com>

kostandinang mentioned this pull request Jun 9, 2026

FE-827: Orchestrator promote-back only adopts a target that is a repo root #183

Merged

kostandinang changed the title ~~FE-829: Register plan-build-architect frontier + executability-contract spec~~ FE-829: Build-architect for spec-derived cook plans Jun 9, 2026

kostandinang marked this pull request as ready for review June 9, 2026 14:56