
FE-864: Brownfield feature delivery from spec — detect, classify, probe, oracle, promote, serve#212

Merged: kostandinang merged 45 commits into main from ka/fe-864-orchestrator-enhancements on Jun 19, 2026
kostandinang commented Jun 15, 2026

Stack Context

This is the middle layer of the cook/brunch-serve stack — the FE-864 "orchestrator improvements" umbrella. It builds the end-to-end brownfield delivery pipeline on top of the pi-agent foundations (#194): take a completed spec, detect the repo's toolchain, run the cook agent, verify the result against a live app, promote it, and drive the whole thing from one brunch serve command. The operational-hardening PR (#224) stabilizes what this introduces.

Consolidates the FE-864 umbrella and its children (FE-867, FE-871, FE-872, FE-875, FE-876, FE-877, FE-878, FE-879) into one PR.

What?

The brownfield delivery pipeline, in dependency order:

  • FE-864 umbrella — orchestrator brownfield enhancements plan; CLI surface settled as plan / cook / serve with brigade names as phases.
  • FE-867 agent extension host — mode-neutral, dual-mode pi-harness contract that the pipeline stages share.
  • FE-871 brunch toolchain detection — detectProfile, monorepo-robust test-dir + workspace-runner detection, generated tests co-located in the repo's own test dir, fail-loud on ambiguous evidence; wired into plan emission.
  • FE-872 install-failure classification — split infra-vs-test failures, name the toolchain cause in the halt reason, unify test execution on one runner behind a verification seam.
  • FE-875 app runtime probe — boot the app, HTTP-probe it, classify reachability; harness-owned buildProbeSpec (port allocation + URL assembly) with bounded, strict-deadline probe calls.
  • FE-876 integration oracle — fold probe reachability into the verify-epic verdict (Half A) behind a reachability-intent seam with an injectable ProbeGrounder (Half B).
  • FE-877 brownfield promotion — commit the cook result onto cook/<runId> via git plumbing.
  • FE-878 brunch serve (capstone) — one-shot plan-then-cook; CLI presentation seam, Ink TUI presenter (egg logo + brigade tracker), live waiting-state panel, shared completed-spec gate, launch-cwd threading. Closes Arc 1.
  • FE-879 lazy cook worktrees — lazy per-slice cook worktrees + shared node_modules; fail loud on slice-id/parent collisions.
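
As a rough illustration of the "port allocation + URL assembly" step named under FE-875, the sketch below asks the OS for a free port and assembles a probe URL from it. The `ProbeSpec` fields, the `allocatePort` helper, and the `buildProbeSpec` signature are assumptions made for this example; the PR names `buildProbeSpec` but does not show its shape.

```typescript
import { createServer } from "node:net";
import type { AddressInfo } from "node:net";

// Hypothetical shape: the PR does not show the real ProbeSpec fields.
interface ProbeSpec {
  port: number;
  url: string;
  deadlineMs: number; // strict upper bound for a single probe call
}

// Ask the OS for a free port by binding to port 0, then release it.
// Another process could grab the port before the app boots, so a real
// harness would still need to handle a failed bind.
function allocatePort(): Promise<number> {
  return new Promise((resolve, reject) => {
    const server = createServer();
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address() as AddressInfo;
      server.close(() => resolve(port));
    });
  });
}

// The harness, not the agent, decides the port and the final URL.
async function buildProbeSpec(path: string, deadlineMs: number): Promise<ProbeSpec> {
  const port = await allocatePort();
  const url = new URL(path, `http://127.0.0.1:${port}`).toString();
  return { port, url, deadlineMs };
}
```

Keeping allocation and URL assembly in one harness-owned function means the booted app and the probe cannot disagree about where to connect.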

Why?

Foundations (#194) made the cook agent safe and portable to run; this layer makes it deliver. Brownfield delivery from a spec needs the orchestrator to understand an unfamiliar repo (detection), tell real failures from infra noise (classification), confirm the result actually runs (probe + integration oracle), land it without clobbering the working tree (promotion onto cook/<runId>), and present all of it coherently from a single command (brunch serve). FE-879 keeps per-slice runs cheap so the pipeline scales.

Folded from the FE-864 umbrella down through its linear children, with FE-879 (a side branch off the umbrella) folded in.

kostandinang commented Jun 15, 2026

@cursor

cursor Bot commented Jun 16, 2026

PR Summary

Medium Risk
Touches git promotion, parallel worktree isolation, and cook halt/oracle paths—high leverage for brownfield correctness but well covered by contract and integration tests; Ink TUI is presentation-only with documented real-terminal gaps.

Overview
Arc 1 brownfield delivery closes the loop from a completed spec to a reviewable cook result: plan-time toolchain detection and test-dir co-location feed emission; cook classifies infra vs test failures and halts with an honest toolchain reason; runProbe / buildProbeSpec boot and HTTP-check reachability for the integration oracle (verify-epic + injectable ProbeGrounder seam); completed runs auto-promote onto cook/<runId> via git plumbing while greenfield keeps opt-in --out.

brunch serve chains plan then cook with launch-cwd threading; lazy slice worktrees and shared node_modules symlinks cut brownfield startup cost. CLI output moves behind emit(CookEvent) with plain/ink/silent presenters (brigade tracker, activity panel; ink added as a dependency).

A new agent-extension-host contract (metadata-only, no imports) documents dual elicit / execute modes and is proven against cook actions and interview tools. PLAN.md and SPEC.md record frontier status (D168, I135-K, I136-K, A98 partial validation) and Arc 2 horizon.

Reviewed by Cursor Bugbot for commit b44eaa1. Bugbot is set up for automated code reviews on this repo.

kostandinang force-pushed the ka/fe-843-toolchain-profiles branch from 8837e9b to b84fbda on June 16, 2026 23:45
kostandinang force-pushed the ka/fe-864-orchestrator-enhancements branch from 95ac829 to eb6e902 on June 16, 2026 23:45
kostandinang force-pushed the ka/fe-843-toolchain-profiles branch from b84fbda to e4e3afe on June 16, 2026 23:55
kostandinang force-pushed the ka/fe-864-orchestrator-enhancements branch 2 times, most recently from 39d50cb to bb36ee8, on June 17, 2026 08:51
kostandinang force-pushed the ka/fe-843-toolchain-profiles branch from e4e3afe to 8557ba7 on June 17, 2026 08:51
kostandinang changed the title from "FE-864: Plan orchestrator brownfield enhancements" to "FE-864: Orchestrator improvements umbrella — brownfield feature delivery from spec" on Jun 17, 2026
kostandinang requested a review from lunelson on June 18, 2026 07:57
kostandinang self-assigned this on Jun 18, 2026
kostandinang changed the base branch from ka/fe-843-toolchain-profiles to graphite-base/212 on June 18, 2026 16:47
kostandinang changed the base branch from graphite-base/212 to ka/fe-841-pi-sdk-embed on June 18, 2026 16:47
kostandinang and others added 20 commits June 19, 2026 16:39
… probe

runProbe's readiness poll and feature fetch used bare global fetch with
no timeout. A server that accepts a connection but never responds would
block await fetch forever — the wall-clock READY_TIMEOUT_MS is only
checked between poll attempts, so it never fired, hanging the probe and
the whole cook. Each fetch now carries a per-call AbortSignal.timeout;
timeouts are overridable so the no-hang behavior is unit-tested fast.

Amp-Thread-ID: https://ampcode.com/threads/T-019ecb9a-9a08-733b-833d-76885fc8243a
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
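
To make the hang described in the probe commit above concrete, here is a minimal sketch of a readiness poll in which every request carries its own `AbortSignal.timeout`. The names `FetchLike`, `PollOptions`, and `pollUntilReady` are hypothetical and chosen for the example; only the per-call timeout and the overridable timeouts come from the commit message.

```typescript
// Hypothetical names throughout; only the per-call AbortSignal.timeout idea
// is taken from the commit message.
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<{ ok: boolean }>;

interface PollOptions {
  fetchImpl: FetchLike;      // injectable so tests need no network
  perCallTimeoutMs: number;  // bound on each individual request
  readyTimeoutMs: number;    // wall-clock bound on the whole poll
  intervalMs: number;        // pause between attempts
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilReady(url: string, opts: PollOptions): Promise<boolean> {
  const deadline = Date.now() + opts.readyTimeoutMs;
  while (Date.now() < deadline) {
    try {
      // The signal aborts this one call, so a server that accepts the
      // connection but never answers cannot block the loop.
      const res = await opts.fetchImpl(url, {
        signal: AbortSignal.timeout(opts.perCallTimeoutMs),
      });
      if (res.ok) return true;
    } catch {
      // Timeout or connection error: retry until the wall-clock deadline.
    }
    await sleep(opts.intervalMs);
  }
  return false;
}
```

Without the per-call signal, the `while` condition is only re-evaluated after `await` returns, which is exactly why the wall-clock deadline alone never fired.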
runCook reads opts.dir raw — the launch-cwd default lives only in
parseCookArgs, which serve bypasses. With dir:'' cook resolved the
just-emitted plan path against process.cwd() and would clone '' for
brownfield, so serve only worked when greenfield and
process.cwd()===launchCwd===project root.

serveCookOptions/runServe now take the resolved cook dir; cli passes
launchCwd (the same dir plan writes the plan to). Corrects the stale
test that asserted cook.dir===''.

Amp-Thread-ID: https://ampcode.com/threads/T-019ecb9a-9a08-733b-833d-76885fc8243a
Co-authored-by: Amp <amp@ampcode.com>
- I126-K: name the shared runVerification seam (FE-872); evaluateVerificationTargets is deleted
- decision 166: brownfield promotion is no longer a follow-on — landed as decision 168 (FE-877)
- decision 168 (new): brownfield auto-promotion (plumbing-only) + brunch serve capstone (FE-877/878)
- I128-K: --out is greenfield-only (brownfield auto-promotes)
- I135-K (new): brownfield promotion never touches the user's checkout

Amp-Thread-ID: https://ampcode.com/threads/T-019ecb9a-9a08-733b-833d-76885fc8243a
Co-authored-by: Amp <amp@ampcode.com>
The serve and plan branches duplicated the spec gate verbatim
(resolveBrunchProject -> createDb -> existence check -> snapshot ->
completeness assert -> db close + uniform error). withCompletedSpec now
owns it; parsing is a thunk so parse errors report through the same
'Failed to run brunch <command>' channel. Pure refactor — cli.test.ts
green.

Amp-Thread-ID: https://ampcode.com/threads/T-019ecb9a-9a08-733b-833d-76885fc8243a
Co-authored-by: Amp <amp@ampcode.com>
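
A small sketch of why passing the parser as a thunk matters for the shared gate described above: the parse runs inside the gate's error handling, so a bad flag and an incomplete spec surface through the same message. The `Spec` and `Resources` interfaces, the parameter list, and the exact message format after the command name are assumptions for illustration, not the PR's `withCompletedSpec` signature.

```typescript
// Illustrative shapes only; the real gate resolves the project and opens a db.
interface Spec {
  complete: boolean;
}

interface Resources {
  loadSpec(): Spec | undefined;
  close(): void;
}

function withCompletedSpec<A, R>(
  command: string,
  open: () => Resources,
  parseArgs: () => A, // thunk: evaluated inside the try block below
  run: (args: A, spec: Spec) => R,
): R {
  try {
    const args = parseArgs();
    const resources = open();
    try {
      const spec = resources.loadSpec();
      if (spec === undefined) throw new Error("no spec found");
      if (!spec.complete) throw new Error("spec is not complete");
      return run(args, spec);
    } finally {
      resources.close(); // always released, whatever the outcome
    }
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    throw new Error(`Failed to run brunch ${command}: ${reason}`);
  }
}
```

Had the caller parsed eagerly and passed the result in, a parse failure would have been thrown before the gate ran and would have needed its own reporting path.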
Introduces a single emit(CookEvent) presentation boundary so terminal
output stops being smeared across console.error/log() in the orchestrator
CLI. Foundation: presenter.ts root + presenter/{events,bus,select,plain,
silent}.ts.

- selectPresenter(command,isTTY,ci,reporterFlag): pure decision table →
  plain (CI/non-TTY/default) | silent (agent, keeps stdout JSONL-clean) |
  ink (interactive TTY; falls back to plain until slice 2).
- CookBus: synchronous fan-out; a thrown presenter is downgraded to a
  process warning so presentation can never abort a run.
- PlainPresenter: CookEvent → stderr, byte-exact for the plan arms; sink
  injectable for the golden differential.
- plan-runner migrated to emit CookEvents; cli.ts plan/serve wired through
  createCookBus. cook left untouched (still behavior-preserving).

Oracle per SPEC I136-K: plan-runner.test.ts now drives a capturing bus and
asserts the same stderr; npm run verify green.

Slice 1b (cook surface + injected-clock elapsed timer) queued in
memory/CARDS.md.

Co-Authored-By: Claude <noreply@anthropic.com>
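
The two properties called out above (a pure backend decision and a bus that survives a throwing presenter) can be sketched roughly as follows. The `Presenter` interface, the single-variant `CookEvent`, the `createBus` factory, and the precedence order inside `selectPresenter` are assumptions for the example; the commit only states the outcomes of the decision table.

```typescript
// Sketch only: event shape, Presenter interface, and precedence rules are
// assumptions, not the PR's actual definitions.
type CookEvent = { kind: "line"; text: string };
type PresenterKind = "plain" | "silent" | "ink";

interface Presenter {
  handle(event: CookEvent): void;
}

// Pure decision table. Assumed precedence: explicit flag, then the agent
// command, then CI / non-TTY, then interactive TTY.
function selectPresenter(
  command: string,
  isTTY: boolean,
  ci: boolean,
  reporterFlag?: PresenterKind,
): PresenterKind {
  if (reporterFlag) return reporterFlag;
  if (command === "agent") return "silent";
  if (ci || !isTTY) return "plain";
  return "ink";
}

// Synchronous fan-out: a presenter that throws is reported as a warning and
// the remaining presenters still receive the event.
function createBus(presenters: Presenter[], warn: (message: string) => void) {
  return {
    emit(event: CookEvent): void {
      for (const presenter of presenters) {
        try {
          presenter.handle(event);
        } catch (error) {
          warn(`presenter failed: ${String(error)}`);
        }
      }
    },
  };
}
```

Because `selectPresenter` takes its environment as arguments instead of reading globals, the whole table can be tested without a real terminal.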
Routes cook/serve terminal output through the emit(CookEvent) boundary,
completing the seam across all three commands.

- cook-cli: banner / completion summary / promotion / petrinaut blocks and
  the early-exit diagnostics now emit {kind:'line'} through the bus; the
  petrinaut-setup log is bus-backed. runCook takes a bus (defaults to
  createCookBus('cook')); serve shares one bus across plan+cook.
- pi-actions: per-action log()/logVerbose() become structured action/
  verbose CookEvents; the module is now console-free. The module-level
  Date.now() elapsed timer is gone — the presenter owns it.
- PlainPresenter: gains an injected clock (I136-K). A cook-start event
  seeds runStart; the elapsed prefix is computed at render time, so the
  cook surface now has a deterministic byte-exact golden.

Verified: presenter goldens (plan + cook arms incl. fake clock),
brownfield-smoke runs cook end-to-end through the bus, npm run verify green.
ink still falls back to plain — that's slice 2.

Co-Authored-By: Claude <noreply@anthropic.com>
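
The injected-clock idea above is what makes a byte-exact golden possible, and a minimal sketch looks like this. The factory name `createPlainPresenter`, the two-variant event type, and the `[N.Ns]` prefix layout are assumptions for the example; the source only says that a cook-start event seeds `runStart` and that the elapsed prefix is computed at render time.

```typescript
// Hypothetical shapes; the real CookEvent union is larger.
type CookEvent =
  | { kind: "cook-start" }
  | { kind: "action"; text: string };

function createPlainPresenter(now: () => number, sink: (line: string) => void) {
  let runStart: number | undefined;
  return {
    handle(event: CookEvent): void {
      if (event.kind === "cook-start") {
        runStart = now(); // seed the run clock
        return;
      }
      // Elapsed time is derived when the line is rendered, from the injected
      // clock, so a fake clock yields fully deterministic output.
      const elapsedSec = runStart === undefined ? 0 : (now() - runStart) / 1000;
      sink(`[${elapsedSec.toFixed(1)}s] ${event.text}`);
    },
  };
}
```

In production the clock would be something like `Date.now`; in a golden test it is a counter the test advances by hand.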
Makes the `ink` backend real (it no longer falls back to plain on a TTY).

- format.ts + clock.ts: line formatting + the elapsed clock extracted from
  PlainPresenter so the plain and Ink backends share one formatter and can't
  drift. PlainPresenter is now a thin sink over formatCookEvent.
- phase.ts: nextPhase — a pure, monotonic brigade tracker
  (prep→recipe→cook→taste→plate→serve) projected from the event stream.
  Coarse for now (post-hoc events); precise in-flight transitions are 2b.
- run-store.ts: folds CookEvents into { phase, lines } with a stable
  snapshot for useSyncExternalStore.
- ink/: egg-logo.ts (ANSI mark), app.tsx (egg header + brigade strip +
  bounded activity log), ink-presenter.tsx (renders to STDERR; stdout stays
  reserved). makePresenter('ink') now returns InkPresenter.

Adds ink@^7 + ink-testing-library@^4 (React 19.2 satisfies the peer dep).
Verified: phase/run-store units, ink-testing-library frame (egg + active
phase + activity line), non-TTY path still plain (brownfield-smoke), full
build bundles the tsx. Real-terminal walkthrough is outer-loop debt; the
dead-air waiting fix is slice 2b.

Co-Authored-By: Claude <noreply@anthropic.com>
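
For readers unfamiliar with the pattern, the sketch below shows one way a monotonic phase tracker and a `useSyncExternalStore`-compatible store could fit together. The phase order is taken from the commit message; the `hinted` parameter, the event shape passed to `apply`, and the `createRunStore` name are assumptions, since the real projection from CookEvents is not shown.

```typescript
const PHASES = ["prep", "recipe", "cook", "taste", "plate", "serve"] as const;
type Phase = (typeof PHASES)[number];

// Monotonic: the tracker only moves forward, so a late or out-of-order event
// cannot drag the displayed phase backwards.
function nextPhase(current: Phase, hinted: Phase | undefined): Phase {
  if (hinted === undefined) return current;
  return PHASES.indexOf(hinted) > PHASES.indexOf(current) ? hinted : current;
}

interface Snapshot {
  phase: Phase;
  lines: readonly string[];
}

// Hypothetical event shape: each event may hint a phase and/or carry a line.
function createRunStore() {
  let snapshot: Snapshot = { phase: "prep", lines: [] };
  const listeners = new Set<() => void>();
  return {
    subscribe(listener: () => void): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    // Returns the same object until the next event, which is what
    // useSyncExternalStore requires of getSnapshot.
    getSnapshot: (): Snapshot => snapshot,
    apply(event: { phase?: Phase; line?: string }): void {
      snapshot = {
        phase: nextPhase(snapshot.phase, event.phase),
        lines: event.line === undefined ? snapshot.lines : [...snapshot.lines, event.line],
      };
      listeners.forEach((listener) => listener());
    },
  };
}
```

A component could then read it with `useSyncExternalStore(store.subscribe, store.getSnapshot)`; building a fresh object inside `getSnapshot` instead would cause React to re-render in a loop.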
Closes the dead-air problem: long waits now show what brunch is doing.

- events: activity-start / activity-progress / activity-end.
- pi-actions: runPi self-brackets every agent session (start → finally end)
  with a throttled KB heartbeat off its token stream; the test-run and probe
  waits bracket via a small withActivity helper. All close in finally, so a
  spinner can't hang — covered by a test that fails the session mid-wait.
- cook-cli: promotion brackets via a `promoting` helper.
- run-store: a pending map (start adds, progress updates detail, end removes);
  activity events stay out of the scrolling log.
- ink: PendingPanel renders a live spinner + label + elapsed + detail, with a
  tick interval that runs only while something is pending. Plain/CI prints one
  `⋯` start line per wait.

Known limit: test-runner uses blocking spawnSync, so the spinner freezes (but
stays labeled) during a test run; the async pi session animates. Real-terminal
walkthrough is outer-loop debt.

Verified: run-store pending units, ink frame (panel shows/clears), balanced
brackets incl. on session failure, npm run verify green.

Co-Authored-By: Claude <noreply@anthropic.com>
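
The bracket-in-finally guarantee and the pending map described above can be sketched together. The event kinds match the commit message; the `id`, `label`, and `detail` fields, the `applyActivity` reducer, and the `withActivity` signature are assumptions made for this illustration.

```typescript
type ActivityEvent =
  | { kind: "activity-start"; id: string; label: string }
  | { kind: "activity-progress"; id: string; detail: string }
  | { kind: "activity-end"; id: string };

type Pending = Map<string, { label: string; detail?: string }>;

// Pending map: start adds, progress updates detail, end removes.
function applyActivity(pending: Pending, event: ActivityEvent): void {
  switch (event.kind) {
    case "activity-start":
      pending.set(event.id, { label: event.label });
      break;
    case "activity-progress": {
      const entry = pending.get(event.id);
      if (entry) entry.detail = event.detail;
      break;
    }
    case "activity-end":
      pending.delete(event.id);
      break;
  }
}

// Brackets any wait; the end event is emitted in finally, so a failing
// session still clears its spinner.
async function withActivity<T>(
  emit: (event: ActivityEvent) => void,
  id: string,
  label: string,
  work: () => Promise<T>,
): Promise<T> {
  emit({ kind: "activity-start", id, label });
  try {
    return await work();
  } finally {
    emit({ kind: "activity-end", id });
  }
}
```

Because the end event is tied to `finally` and not to the success path, the pending map is empty after every run, including ones that throw mid-wait.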
ln-review caught that nothing ever called bus.dispose() — harmless for
plain/silent, but on a real TTY the Ink app was never unmounted, so
`brunch cook`/`serve` would hang after the run.

- withCookBus(command, fn): builds the bus, runs the work, and disposes it
  (→ unmounts Ink) in finally. One owner, no split ownership.
- runCook takes a required bus (drops the in-cook createCookBus default).
- cli.ts cook/plan/serve paths run through withCookBus; serve's single
  shared bus is disposed once after the cook stage.

Verified: withCookBus disposes on success and on throw (spy); CookBus.dispose
fan-out test stands; npm run verify green. Remaining real-terminal debt is now
purely visual.

Co-Authored-By: Claude <noreply@anthropic.com>
)

The cook banner/summary text had no oracle (the migration preserved it
verbatim but nothing guarded against drift). Extract cookBannerLines /
cookSummaryLines as pure functions and golden-test them; runCook feeds their
output to the bus. Covers completed + halted runs incl. the epic/slice tree.

npm run verify green.

Co-Authored-By: Claude <noreply@anthropic.com>
…arks

Per feedback: drop the egg, use the "brunch" wordmark tinted with the
brunch.ai brand gradient (HASH blue→indigo→violet, one hex per letter), and
keep the brigade/status glyphs as the original monochrome marks (✓ ◐ ○)
rather than emoji. egg-logo.ts → wordmark.ts. Plain/CI backend stays
untinted. Ink frame tests updated.

Co-Authored-By: Claude <noreply@anthropic.com>
The panel re-rendered every 120ms recomputing toFixed(1) elapsed, so the
number jittered at the decimal. Add formatElapsed (whole seconds under a
minute, m:ss above), use it in the panel, and slow the spinner tick to 250ms.
The static action-log prefix (a fixed record) keeps its one-decimal form.

Co-Authored-By: Claude <noreply@anthropic.com>
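
The formatting rule in the commit above (whole seconds under a minute, m:ss above) is small enough to sketch directly. The millisecond input and the trailing `s` on the sub-minute form are assumptions; the commit does not show the function's signature or exact output.

```typescript
// Assumes a millisecond input and an "s" suffix for the sub-minute form.
function formatElapsed(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  if (totalSeconds < 60) return `${totalSeconds}s`;
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = String(totalSeconds % 60).padStart(2, "0");
  return `${minutes}:${seconds}`;
}
```

Flooring to whole seconds means the rendered string changes at most once per second, so re-rendering on every spinner tick no longer makes the number flicker.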
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
kostandinang force-pushed the ka/fe-864-orchestrator-enhancements branch from 38400fc to b44eaa1 on June 19, 2026 15:39
kostandinang changed the base branch from graphite-base/212 to main on June 19, 2026 15:39
kostandinang dismissed lunelson’s stale review on June 19, 2026 15:39

The base branch was changed.

kostandinang requested a review from lunelson on June 19, 2026 15:50
kostandinang added this pull request to the merge queue on Jun 19, 2026
Merged via the queue into main with commit f5e2f5d Jun 19, 2026
12 checks passed
kostandinang deleted the ka/fe-864-orchestrator-enhancements branch on June 19, 2026 15:54
2 participants