This guide consolidates the expected behavior for conditional gating, design‑by‑contract, routing, and retries in the state‑machine engine. It follows safety‑critical software principles (detect → isolate → recover → report) while remaining practical for CI/PR automation.
Assume vs. Guarantee — Do’s and Don’ts
Do
- Use `assume` for pre‑execution prerequisites that do not depend on this step’s output (env, memory, upstream results).
- Keep expressions pure (no time/random/network); short and deterministic.
- Use `guarantee` to assert properties of this step’s produced output (shape, size caps, idempotency markers, control signals).
- For critical steps, pair both: `assume` (preflight) + `guarantee` (post‑exec safety lock).
Don’t
- Don’t reference `output` of the same step in `assume` (it runs before execution).
- Don’t put policy thresholds into `guarantee`; use `fail_if` for policy/quality gates.
- Don’t rely on side‑effects or external clocks in expressions.
- Deterministic evaluations: expressions are pure (no side‑effects, time/network), evaluated in a sandbox.
- Fail‑secure defaults: evaluation errors pick the safest behavior (skip or fail closed), and are logged.
- Bounded retries: never unbounded loops; per‑scope caps and loop budgets.
- Isolation: failures do not cascade unless explicitly permitted.
- Auditability: every decision is journaled with cause, scope, and timestamps; JSON snapshots are exportable.
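The fail‑secure default above can be illustrated with a small wrapper. This is only a sketch: `evaluateGate` and `GateResult` are hypothetical names, and a plain function stands in for a sandboxed expression, so the engine's real evaluator is not modeled here.

```typescript
// Hypothetical sketch; not the engine's actual evaluator.
type GateResult = { pass: boolean; error?: string };

// `predicate` stands in for a sandboxed gate expression such as `if` or `assume`.
function evaluateGate(predicate: () => boolean): GateResult {
  try {
    return { pass: predicate() === true };
  } catch (err) {
    // Fail closed: an evaluation error never lets the step run.
    const message = err instanceof Error ? err.message : String(err);
    return { pass: false, error: message };
  }
}

const env: Record<string, string | undefined> = { CI: "true" };
const ok = evaluateGate(() => env.CI === "true");
const broken = evaluateGate(() => {
  throw new Error("filesCount is not defined");
});
console.log(ok, broken);
```

The error message is kept on the result so that the skip can be logged with its cause.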
Criticality classifies a step by the operational risk it carries. The engine uses it to pick safe defaults for contracts, gating, retries, loop budgets, and side‑effects. continue_on_failure only controls dependency gating; it does not define criticality.
Declare criticality on each check:
```yaml
checks:
  post-comment:
    type: github
    criticality: external # external | internal | policy | info
```
Meanings:
- external
  - Step mutates an external system (GitHub ops, HTTP methods ≠ GET/HEAD, file writes).
  - Defaults: contracts required; `continue_on_failure: false`; retries only for transient faults; tighter loop budgets; suppress downstream mutating actions when contracts or `fail_if` fail; idempotency or compensation expected.
- internal
  - Step drives routing or fan‑out (forEach parents; on_* with goto/run; memory used by guards).
  - Defaults: contracts required for route integrity; `continue_on_failure: false`; tighter loop budgets (recommended 8); retries only for transient faults; treat loops/recirculation conservatively.
- policy
  - Step enforces permissions/compliance gates (permission checks, org policy, human‑in‑the‑loop approvals).
  - Defaults: contracts required; `continue_on_failure: false`; logical violations are failures (no auto‑retry), downstream mutating actions blocked until remediated.
- info
  - Read‑only or low‑risk compute.
  - Defaults: contracts recommended (not required); may allow `continue_on_failure: true`; standard loop budgets and retry bounds.
Precedence & inference:
- Explicit `criticality` on a check takes precedence over tags or heuristics.
- If `criticality` is omitted, the engine may infer:
  - mutating providers → external; forEach parents or on_* goto/run → internal (control‑plane); strong policy gates → policy; otherwise info (non‑critical).
Overriding defaults:
- You can override any default (e.g., set `continue_on_failure: true` or adjust budgets) per check. Criticality sets sensible baselines; it does not lock you in.
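To make the "baseline plus override" idea concrete, here is a minimal sketch. The type and function names (`StepPolicy`, `defaultPolicy`, `resolvePolicy`) are hypothetical, and the `continueOnFailure: false` baseline for `info` is an assumption; only the retry and loop numbers are the recommended values from this guide.

```typescript
// Hypothetical names and shapes; only the numeric values come from this guide.
type Criticality = "external" | "internal" | "policy" | "info";

interface StepPolicy {
  contractsRequired: boolean;
  continueOnFailure: boolean;
  maxRetries: number;
  maxLoops: number;
}

function defaultPolicy(criticality: Criticality): StepPolicy {
  const critical = criticality !== "info";
  return {
    contractsRequired: critical,
    continueOnFailure: false, // assumed baseline; `info` steps may opt in to true
    maxRetries: critical ? 2 : 3,
    maxLoops: critical ? 8 : 10,
  };
}

// Per-check overrides win over the criticality baseline.
function resolvePolicy(
  criticality: Criticality,
  overrides: Partial<StepPolicy> = {},
): StepPolicy {
  return { ...defaultPolicy(criticality), ...overrides };
}

console.log(resolvePolicy("info", { continueOnFailure: true }));
```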
Below is the exact behavior for each construct depending on criticality. “Skip” means provider is not executed and it does not count as a run.
- `if` (plan‑time)
  - Non‑critical: false/error → skip; dependents may run if OR‑deps satisfy or they are unrelated.
  - Critical: same skip; because `continue_on_failure: false` is default, downstream mutators must depend on this step and will skip.
- `assume` (pre‑exec)
  - Non‑critical: false/error → skip (skipReason=assume); no retry.
  - Critical: false/error → skip and block downstream side‑effects via dependency gating. If you need an explicit failure (not a skip), add a guard step (see Example C below) or (optional) `assume_mode: 'fail'` when available.
- `guarantee` (post‑exec)
  - Non‑critical: violation → add `contract/guarantee_failed` issue; mark failure; route `on_fail`; no auto‑retry unless remediation exists.
  - Critical: violation → mark failure; suppress downstream mutating actions (dependents should depend_on this step). Route `on_fail` to remediation; retries only for transient exec faults, not logical ones.
- `fail_if` (post‑exec)
  - Non‑critical: true → failure; bounded retry only for transient faults; otherwise remediation.
  - Critical: true → failure; do not auto‑retry logical failures; block side‑effects; route remediation with tight caps.
- `transitions` / `goto`
  - Both: prefer declarative `transitions` first; respect per‑scope loop budgets (default 10; critical recommended 8). Exceeding budget adds `routing/loop_budget_exceeded` and halts routing in that scope.
Numeric defaults (recommended)
- Retries: max 3 (non‑critical), max 2 (critical), exponential backoff with jitter (e.g., 1s, 2s, 4s ±10%).
- Loop budgets: 10 (non‑critical), 8 (critical/control‑plane branches).
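The retry schedule above (1s, 2s, 4s ±10%) can be computed as follows. This is a sketch under assumptions: `backoffDelayMs` is a hypothetical helper, and the injectable `random` parameter exists only to make the example deterministic.

```typescript
// Hypothetical helper; the engine's real backoff code is not shown in this guide.
function backoffDelayMs(
  attempt: number, // 1-based retry attempt
  baseMs: number = 1000,
  jitterRatio: number = 0.1,
  random: () => number = Math.random,
): number {
  const exponential = baseMs * Math.pow(2, attempt - 1); // 1s, 2s, 4s, ...
  const jitter = (random() * 2 - 1) * jitterRatio; // uniform in [-10%, +10%)
  return Math.round(exponential * (1 + jitter));
}

// With random() fixed at 0.5 the jitter term is zero.
const noJitter = [1, 2, 3].map((n) => backoffDelayMs(n, 1000, 0.1, () => 0.5));
console.log(noJitter); // [1000, 2000, 4000]
```

A caller would still enforce the cap (max 3 or max 2) separately; the helper only computes the delay for a given attempt.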
- Purpose
  - `if`: scheduling gate; decides whether the step should be scheduled in this run.
  - `assume`: preconditions; must hold immediately before executing the provider.
  - `guarantee`: postconditions; must hold for the result the provider produced.
  - `fail_if`: failure detector; declares which results count as failures.
- When it runs
  - `if`: before scheduling (earliest).
  - `assume`: after scheduling, right before calling the provider.
  - `guarantee`: immediately after provider returns.
  - `fail_if`: immediately after provider returns (can co‑exist with `guarantee`).
- Inputs visible to the expression
  - `if`: event, env, filesChanged meta, previous check outputs (current wave), memory (read‑only helpers).
  - `assume`: same as `if`, plus fully resolved dependency results for this scope.
  - `guarantee`/`fail_if`: same as `assume`, plus the step’s own output/result.
- Effect on execution
  - `if` false (or error): step is skipped and never scheduled.
  - `assume` false (or error): step is skipped right before execution; provider is not called.
  - `guarantee` violation: step has executed; violation adds issues; routes `on_fail`.
  - `fail_if` true: step has executed; marks failure; routes `on_fail`.
- Stats/journal
  - `if`/`assume` skip: recorded as a skip; does not count as a run; journal contains an empty result entry.
  - `guarantee`/`fail_if`: counted run; issues recorded; journal contains the full result.
- Routing & dependents
  - Skips (`if`/`assume`) propagate gating to dependents unless OR‑deps satisfy or `continue_on_failure` applies on an alternate path.
  - Failures (`guarantee`/`fail_if`) route via `on_fail` with bounded retries/remediation.
When to choose which
- Use `if` when you can decide at plan time whether a step should even be considered (tags, events, coarse repo conditions).
- Use `assume` when prerequisites depend on dynamic dependencies or environment right before execution (e.g., tools bootstrapped by a `prepare` step).
- Use `guarantee` when the provider must produce outputs that satisfy invariants (shape, counts, idempotency confirmations).
- Use `fail_if` when policy/thresholds on the produced results define failure (test counts, lints, security finding thresholds).
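The ordering of the four constructs can be summarized in a toy model. Everything here is illustrative: `Step`, `Outcome`, and `executeStep` are hypothetical names, evaluation errors are not modeled, and the sketch returns on the first post‑exec violation even though `guarantee` and `fail_if` can co‑exist in the engine.

```typescript
// Hypothetical model of one step; names are illustrative, not the engine API.
type Outcome =
  | "skipped:if"
  | "skipped:assume"
  | "failed:guarantee"
  | "failed:fail_if"
  | "ok";

interface Step<T> {
  if?: () => boolean; // plan-time gate
  assume?: Array<() => boolean>; // preconditions; cannot see this step's output
  run: () => T; // the provider
  guarantee?: Array<(output: T) => boolean>; // postconditions
  failIf?: (output: T) => boolean; // failure detector
}

function executeStep<T>(step: Step<T>): { outcome: Outcome; providerCalled: boolean } {
  if (step.if && !step.if()) return { outcome: "skipped:if", providerCalled: false };
  if ((step.assume ?? []).some((a) => !a())) {
    return { outcome: "skipped:assume", providerCalled: false };
  }
  const output = step.run();
  if ((step.guarantee ?? []).some((g) => !g(output))) {
    return { outcome: "failed:guarantee", providerCalled: true };
  }
  if (step.failIf && step.failIf(output)) {
    return { outcome: "failed:fail_if", providerCalled: true };
  }
  return { outcome: "ok", providerCalled: true };
}

const outcomeDemo = executeStep<{ items: number[] }>({
  assume: [() => true],
  run: () => ({ items: [] }),
  guarantee: [(o) => o.items.length > 0],
});
console.log(outcomeDemo); // outcome "failed:guarantee", providerCalled true
```

Note that both skip outcomes leave `providerCalled` false, which matches the rule that skips do not count as runs.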
- Purpose: schedule a step only when conditions are met (event, env, prior outputs).
- Behavior:
  - `if` true → run.
  - `if` false or evaluation error → skip with reason `if_condition` (does not count as a run); dependents skip unless alternate OR‑deps satisfy.
- Example:
```yaml
checks:
  lint:
    type: command
    on:
      - pr_opened
      - pr_updated
    if: "filesCount > 0 && env.CI === 'true'"
    exec: npx eslint .
```
- Purpose: non‑negotiable prerequisites before a step executes.
- Behavior:
  - Any `assume` expression false → skip with reason `assume`. In critical branches, this blocks dependent mutating steps via dependency gating.
  - No automatic retry unless a defined remediation can satisfy the precondition.
Important: assume runs before the provider; do not reference this step’s own output inside assume. Use dependency results (e.g., outputs['dep']) or environment/memory. Assertions about this step’s produced data belong in guarantee.
- Example with remediation:
```yaml
checks:
  prepare-env:
    type: command
    exec: node scripts/bootstrap.js
  analyze:
    type: command
    depends_on:
      - prepare-env
    assume:
      - "env.TOOLING_READY === 'true'"
      - "Array.isArray(outputs_history['prepare-env']) ? true : true"
    exec: node scripts/analyze.js
```
- Purpose: invariants that must hold after a step completes.
- Behavior:
  - Violations add issues with ruleId `contract/guarantee_failed`, mark failure, and route via `on_fail`.
  - In critical branches, violation blocks downstream mutating actions (dependents should be gated on this step) and is not auto‑retried as a logical failure.
- Example:
```yaml
checks:
  summarize:
    type: command
    # Inner JSON quotes are escaped as \\\" so the YAML string stays valid.
    exec: "node -e \"console.log('{\\\"items\\\":[1,2,3]}')\""
    guarantee:
      - "output && Array.isArray(output.items)"
      - "output.items.length > 0"
    on_fail:
      run:
        - recompute
  recompute:
    type: command
    exec: node scripts/recompute.js
```
- Purpose: codifies “this result means failure.”
- Behavior:
  - If true → mark step failed, append `<check>_fail_if` issue, and route `on_fail`.
  - Evaluation errors → log, treat as not triggered (prefer separate system issue).
- Example with bounded retry/backoff:
```yaml
checks:
  tests:
    type: command
    exec: npm test -- --runInBand
    fail_if: "output.summary.failed > 0"
    on_fail:
      retry: { max: 2, backoff: { mode: exponential, delay_ms: 1000 } }
      run:
        - collect-logs
  collect-logs:
    type: command
    exec: node scripts/collect-logs.js
```
Use `transitions` for clear, testable routing without inline JS logic. If none match, the engine falls back to `goto_js`/`goto`.
Helpers available inside when: outputs, outputs_history, output, event, memory, plus any/all/none/count.
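To show how the `any`/`all`/`none`/`count` helpers are meant to read inside a `when` expression, here is a sketch with assumed semantics. The real helpers may behave differently, for example in how they treat a missing history array; this version treats it as empty.

```typescript
// Assumed semantics for the `when` helpers; not taken from the engine source.
type Pred<T> = (value: T) => boolean;

function any<T>(items: T[] | undefined, pred: Pred<T>): boolean {
  return (items ?? []).some(pred);
}
function all<T>(items: T[] | undefined, pred: Pred<T>): boolean {
  return (items ?? []).every(pred);
}
function none<T>(items: T[] | undefined, pred: Pred<T>): boolean {
  return !any(items, pred);
}
function count<T>(items: T[] | undefined, pred: Pred<T>): number {
  return (items ?? []).filter(pred).length;
}

// Stand-in for the `outputs_history` value visible to a `when` expression.
const outputsHistory: Record<string, Array<{ is_valid: boolean }>> = {
  "validate-fact": [{ is_valid: true }, { is_valid: false }],
};
const needsRemediation = any(outputsHistory["validate-fact"], (v) => v.is_valid === false);
console.log(needsRemediation); // true
```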
```yaml
checks:
  extract-facts:
    type: ai
    forEach: true
    on_finish:
      transitions:
        - when: "any(outputs_history['validate-fact'], v => v && v.is_valid === false) && event.name === 'issue_opened'"
          to: issue-assistant
        - when: "any(outputs_history['validate-fact'], v => v && v.is_valid === false) && event.name === 'issue_comment'"
          to: comment-assistant
  validate-fact:
    type: ai
    depends_on:
      - extract-facts
  issue-assistant:
    type: ai
    on_success:
      transitions:
        - when: "event.name === 'issue_comment' && output?.intent === 'comment_retrigger'"
          to: overview
          goto_event: pr_updated
```
A step is critical when it meets any of:
- External side effects (mutating: GitHub ops, HTTP methods ≠ GET/HEAD, file writes).
- Control‑plane impact (forEach parents, on_* that drive goto/run, memory used by conditions).
- Safety/policy gates (permission checks, strong `fail_if`/`guarantee`).
- Irreversible/noisy effects (user‑visible posts, ticket creation).
Pragmatic marking today:
- Use `tags: [critical]` (and optionally `internal`, `external`).
- Heuristics: treat mutating providers as critical by default.
Policy matrix (default)
- Non‑critical: `assume` skip (no retry); `guarantee` → issues + on_fail; `fail_if` → failure; retries only for transient faults.
- Critical: `assume` violation blocks dependents; `guarantee` violations prevent downstream side‑effects; `fail_if` retried only if transient; tighter loop budgets.
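The "retry only if transient" rule in the matrix above reduces to a small decision function. This is a sketch: `FaultClass` and `shouldRetry` are hypothetical names, and how the engine actually classifies a fault as transient or logical is not described in this guide.

```typescript
// Hypothetical classification; the guide only states that transient faults may
// be retried within a cap and that logical failures are not auto-retried.
type FaultClass = "transient" | "logical";

function shouldRetry(fault: FaultClass, attemptsSoFar: number, critical: boolean): boolean {
  const maxRetries = critical ? 2 : 3; // recommended caps from this guide
  return fault === "transient" && attemptsSoFar < maxRetries;
}

console.log(shouldRetry("transient", 0, true)); // true
console.log(shouldRetry("logical", 0, false)); // false
```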
continue_on_failure is a dependency‑gating knob: it decides whether dependents may run after this step fails. It does not fully define criticality. A NASA‑style notion of criticality also governs contracts, retries, loop budgets, side‑effect controls, and escalation paths.
Recommended practice:
- Use `continue_on_failure` to control gating per edge, but classify steps explicitly as critical or not.
- Express criticality today via tags, and (optionally) promote to a dedicated field later.
- Using tags (immediately usable):
```yaml
checks:
  post-comment:
    type: github
    tags:
      - critical
      - external
    on:
      - pr_opened
    op: comment.create
    assume:
      - "env.ALLOW_POST === 'true'"
    guarantee:
      - "typeof output.id === 'number'"
    continue_on_failure: false
    on_fail:
      retry: { max: 2, backoff: { mode: exponential, delay_ms: 1500 } }
```
- Using a proposed field (future‑proof, clearer intent):
```yaml
checks:
  label:
    type: github
    criticality: external # or: internal | policy | info
    on:
      - pr_opened
    op: labels.add
    values:
      - "reviewed"
    assume: "isMember()"
    guarantee: "Array.isArray(output.added) && output.added.includes('reviewed')"
```
Engine policy derived from criticality (summary):
- Critical (external/control‑plane/policy):
  - require meaningful `assume` and `guarantee`.
  - `continue_on_failure: false` by default.
  - retries only for transient faults, with tight caps and backoff.
  - lower routing loop budgets for branches this step drives.
  - suppress downstream mutating side‑effects when guarantees fail.
- Non‑critical:
  - `assume`/`guarantee` recommended but not mandatory.
  - may set `continue_on_failure: true` to keep non‑critical branches running.
- Non‑critical compute that may fail without stopping the pipeline:
```yaml
checks:
  summarize:
    type: ai
    tags:
      - info
    on:
      - pr_opened
      - pr_updated
    continue_on_failure: true
    fail_if: "(output.errors || []).length > 0"
```
- External (critical), posting a PR comment with strict contracts and bounded retries:
```yaml
checks:
  post-comment:
    type: github
    tags:
      - critical
      - external
    on:
      - pr_opened
    op: comment.create
    assume:
      - "isMember()"
      - "env.DRY_RUN !== 'true'"
    guarantee:
      - "output && typeof output.id === 'number'"
    continue_on_failure: false
    on_fail:
      retry: { max: 2, backoff: { mode: exponential, delay_ms: 1200 } }
```
- Control‑plane (critical), forEach parent that drives routing with a tighter loop budget:
```yaml
routing:
  max_loops: 8 # lower than default for safety on control‑plane flows
checks:
  extract-items:
    type: command
    tags:
      - critical
      - internal
    # Inner JSON quotes are escaped as \\\" so the YAML string stays valid.
    exec: "node -e \"console.log('[\\\"a\\\",\\\"b\\\",\\\"c\\\"]')\""
    forEach: true
    on_finish:
      transitions:
        - when: "any(outputs_history['validate'], x => x && x.ok === false)"
          to: remediate
  validate:
    type: command
    depends_on:
      - extract-items
    fanout: map
    exec: node scripts/validate.js
  remediate:
    type: command
    exec: node scripts/fix.js
```
- Retries: bounded (e.g., max 3), exponential backoff with jitter; per‑scope attempt counters stored in memory.
- Routing loop budget: `routing.max_loops` (default 10) per scope; exceeding emits `routing/loop_budget_exceeded` and halts routing for that scope.
- ForEach fan‑out:
  - Per‑item retries are independent; partial success allowed; failed items are isolated.
  - Aggregates reflect per‑item outcomes; reduce/map fan‑out is controlled via `fanout: 'reduce' | 'map'` on dependents.
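The per‑scope routing loop budget described above can be sketched as a counter keyed by scope. `LoopBudget` and `tryRoute` are hypothetical names; only the default of 10 and the `routing/loop_budget_exceeded` rule id come from this guide.

```typescript
// Hypothetical per-scope loop counter; not the engine's implementation.
class LoopBudget {
  private readonly counts = new Map<string, number>();
  private readonly maxLoops: number;

  constructor(maxLoops: number = 10) {
    this.maxLoops = maxLoops;
  }

  // Returns true if the routing hop is allowed, false once the scope is over budget.
  tryRoute(scope: string): boolean {
    const used = this.counts.get(scope) ?? 0;
    if (used >= this.maxLoops) return false;
    this.counts.set(scope, used + 1);
    return true;
  }
}

const budget = new LoopBudget(2);
const hops = [budget.tryRoute("root"), budget.tryRoute("root"), budget.tryRoute("root")];
// The third hop is refused; this is where the engine would add
// `routing/loop_budget_exceeded` and halt routing for the scope.
console.log(hops); // [true, true, false]
```

Because the counter is keyed by scope, exhausting the budget in one scope does not affect routing in another.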
```yaml
checks:
  list:
    type: command
    # Inner JSON quotes are escaped as \\\" so the YAML string stays valid.
    exec: "node -e \"console.log('[\\\"a\\\",\\\"b\\\"]')\""
    forEach: true
  process:
    type: command
    depends_on: [list]
    fanout: map
    exec: node scripts/process-item.js
    fail_if: "output.__failed === true"
    on_fail:
      retry: { max: 1, backoff: { mode: fixed, delay_ms: 500 } }
```
- Every decision is committed to the journal (check id, scope, event, output, issues, timing).
- Export last run snapshot to JSON for post‑mortem or replay scaffolding:
```typescript
const engine = new StateMachineExecutionEngine();
const result = await engine.executeChecks({ checks: ['build'], config });
await engine.saveSnapshotToFile('run-snapshot.json');
// const snap = await engine.loadSnapshotFromFile('run-snapshot.json');
```
- `if`: error → skip (`if_condition`).
- `assume`: violation → skip (or block dependents if critical).
- `guarantee`: violation → add `contract/guarantee_failed` issue; route on_fail.
- `fail_if`: true → failure; retries only for transient classifications.
- Loop budgets and retry caps prevent unbounded execution.
For additional examples, see:
- defaults/visor.yaml (fact validation transitions)
- tests/unit/routing-transitions-and-contracts.test.ts (transitions, assume/guarantee)
- docs/engine-state-machine-plan.md (state machine overview)
This section summarizes the full, NASA‑style approach we recommend. Items marked (optional) are enhancements you can phase in.
- Criticality (proposed field; tags remain a fallback)
  - `criticality: external | internal | policy | info`
  - or minimal boolean `critical: true|false` if you prefer simplicity.
- Contracts (implemented)
  - `assume:` preconditions (list of expressions)
  - `guarantee:` postconditions (list of expressions)
  - (optional) `assume_mode: 'skip' | 'fail'`: if set to `fail`, unmet assume marks failure and routes `on_fail`.
- Transitions (implemented)
  - `on_success|on_fail|on_finish.transitions: [{ when, to, goto_event? }]` with `goto_js` fallback.
- Retries
  - (proposed) `retry_on: ['transient'] | ['transient','logical']` (default: transient only).
- Safety profiles (optional)
  - `safety: strict | standard` (global defaults for budgets/retries on critical branches).
- External / Control‑plane / Policy (critical)
  - Require meaningful `assume` and `guarantee`.
  - Default `continue_on_failure: false`.
  - Retries: bounded (max 2–3), transient faults only; no auto‑retry for logical violations.
  - Lower per‑scope loop budget (e.g., 8 instead of 10).
  - Suppress downstream mutating actions if guarantees/fail_if violate; remediate or escalate.
- Non‑critical
  - Contracts recommended but not required.
  - `continue_on_failure: true` allowed where safe.
  - Default loop budget (10), normal retry bounds.
- Evaluation order
  1) `if` (plan‑time scheduling) → 2) `assume` (pre‑exec) → 3) provider → 4) `guarantee` + `fail_if` (post‑exec) → 5) transitions/goto.
- Determinism & safety
  - Expressions run in a secure sandbox; no I/O, time, or randomness; short timeouts.
- ForEach isolation
  - `fanout: map` executes per‑item; failures isolate; reduce aggregates once.
  - (optional) per‑item concurrency with default 1.
- Detect mutating providers (GitHub ops except read‑only, HTTP methods ≠ GET/HEAD, file writes).
- For critical steps: require idempotency or compensating actions; block side‑effects when contracts fail.
- Journal each decision (check, scope, expression, inputs, result, timestamps).
- Emit structured fault events: `fault.detected`, `fault.isolated`, `fault.recovery.*`.
- Metrics: retries, fault counts by class, loop budget hits.
- Export last run as JSON (implemented): `saveSnapshotToFile()`.
- (future) Debug‑only resume that reconstructs state from snapshot.
- Warn if a critical step lacks `assume` or `guarantee`.
- Warn if mutating provider lacks criticality classification.
- Warn if `transitions` exist with tight loops disabled in `strict` safety profile.
- CLI `--safe-mode` to disable mutating providers for dry‑runs.
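The first two warnings above could be implemented along these lines. This is a sketch, not the engine's validator: `CheckConfig` and `lintChecks` are hypothetical, and treating every `github` check as mutating is a simplifying assumption (read‑only GitHub ops would need to be excluded).

```typescript
// Hypothetical config linter; field names follow the YAML examples in this guide.
interface CheckConfig {
  type: string;
  criticality?: "external" | "internal" | "policy" | "info";
  assume?: string | string[];
  guarantee?: string | string[];
}

function lintChecks(checks: Record<string, CheckConfig>): string[] {
  const warnings: string[] = [];
  for (const [id, check] of Object.entries(checks)) {
    const critical = check.criticality !== undefined && check.criticality !== "info";
    if (critical && (check.assume === undefined || check.guarantee === undefined)) {
      warnings.push(`${id}: critical step lacks assume or guarantee`);
    }
    // Assumption: any `github` check is treated as a mutating provider.
    if (check.type === "github" && check.criticality === undefined) {
      warnings.push(`${id}: mutating provider lacks criticality classification`);
    }
  }
  return warnings;
}

const warnings = lintChecks({
  "post-comment": { type: "github" },
  label: { type: "github", criticality: "external", assume: "isMember()" },
  summarize: { type: "ai", criticality: "info" },
});
console.log(warnings);
```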
- Unit
  - `assume` skip vs guard‑step hard‑fail.
  - `guarantee` violations add issues; no extra provider calls.
  - Transitions precedence over `goto_js`; loop budget enforcement.
- Integration
  - Critical external step blocks downstream side‑effects on contract failure.
  - Control‑plane forEach parent with tight budget; verifies no loops past limit.
  - Retry policy honors transient vs logical classification.
- YAML e2e
  - Updated defaults remain green; include a strict safety profile scenario.
- All tests (unit/integration/YAML) green with critical/non‑critical mixes.
- Docs updated (this guide + engine plan); examples use block‑style YAML.
- Logger outputs timestamps; debug is gated.
- No dist/ committed; config validators warn on unsafe critical steps.
```yaml
checks:
  post-comment:
    type: github
    criticality: external
    on:
      - pr_opened
    op: comment.create
    assume:
      - "isMember()"
      - "env.DRY_RUN !== 'true'"
    guarantee:
      - "output && typeof output.id === 'number'"
    continue_on_failure: false
    on_fail:
      retry: { max: 2, backoff: { mode: exponential, delay_ms: 1200 } }
```
```yaml
routing:
  max_loops: 8
checks:
  extract-items:
    type: command
    criticality: internal
    # Inner JSON quotes are escaped as \\\" so the YAML string stays valid.
    exec: "node -e \"console.log('[\\\"a\\\",\\\"b\\\"]')\""
    forEach: true
    on_finish:
      transitions:
        - when: "any(outputs_history['validate'], v => v && v.ok === false)"
          to: remediate
  validate:
    type: command
    depends_on:
      - extract-items
    fanout: map
    exec: node scripts/validate.js
  remediate:
    type: command
    exec: node scripts/fix.js
```
Using `if` (best: planning‑time decision):
```yaml
checks:
  summarize:
    type: ai
    on:
      - pr_opened
      - pr_updated
    if: "filesCount > 0"
    exec: node scripts/summarize.js
```
Using `assume` (works but later in the lifecycle):
```yaml
checks:
  summarize:
    type: ai
    on:
      - pr_opened
      - pr_updated
    assume:
      - "filesCount > 0"
    exec: node scripts/summarize.js
```
Both skip the step; `if` prunes earlier, `assume` skips right before calling the provider.
Using guarantee (contract):
```yaml
checks:
  collect:
    type: command
    exec: "node collect.js" # produces { items: [...] }
    guarantee:
      - "output && Array.isArray(output.items)"
      - "output.items.length > 0"
    on_fail:
      run:
        - recompute
```
Using `fail_if` (policy):
```yaml
checks:
  collect:
    type: command
    exec: "node collect.js"
    fail_if: "!(output && Array.isArray(output.items) && output.items.length > 0)"
    on_fail:
      run:
        - recompute
```
Both mark the run as failed and route `on_fail`; use `guarantee` for design‑by‑contract semantics, `fail_if` for policy rules.
If you need an explicit failure instead of a skip for an unmet assume, use a guard:
```yaml
checks:
  prechecks:
    type: command
    exec: node scripts/check-tools.js # exit 1 when tools missing
    fail_if: "output.exitCode !== 0"
  analyze:
    type: command
    depends_on:
      - prechecks
    exec: node scripts/analyze.js
```
This end‑to‑end example shows criticality, `if`, `assume`, `guarantee`, `fail_if`, and declarative transitions in one flow. It includes fan‑out (control‑plane), a policy gate, and an external step with contracts.
```yaml
version: "1.0"
routing:
  # Tighter budget recommended for control‑plane loops
  max_loops: 8
checks:
  # 1) Control‑plane fan‑out producer
  extract-facts:
    type: command
    criticality: internal
    on:
      - issue_opened
      - issue_comment
    # Inner JSON quotes are escaped as \\\" so the YAML string stays valid.
    exec: "node -e \"console.log('[{\\\"id\\\":1,\\\"claim\\\":\\\"A\\\"},{\\\"id\\\":2,\\\"claim\\\":\\\"B\\\"}]')\""
    forEach: true
    # Postconditions: enforce shape and cap fan‑out size
    guarantee:
      - "Array.isArray(output)"
      - "output.every(x => typeof x.id === 'number' && typeof x.claim === 'string')"
      - "output.length <= 50"
    # Route back for remediation when any validation failed
    on_finish:
      transitions:
        - when: "any(outputs_history['validate-fact'], v => v && v.is_valid === false) && event.name === 'issue_opened'"
          to: issue-assistant
        - when: "any(outputs_history['validate-fact'], v => v && v.is_valid === false) && event.name === 'issue_comment'"
          to: comment-assistant
  # 2) Map fan‑out validator
  validate-fact:
    type: command
    depends_on:
      - extract-facts
    fanout: map
    exec: node scripts/validate-fact.js # -> { is_valid: boolean, errors?: number }
    # declare policy failure
    fail_if: "output && output.is_valid === false"
    on_fail:
      # Only retry transient provider errors (e.g., script crashed), not logical invalids
      retry: { max: 1, backoff: { mode: exponential, delay_ms: 1000 } }
  # 3) Control‑plane aggregator that computes overall validity
  aggregate:
    type: command
    criticality: internal
    depends_on:
      - validate-fact
    exec: node scripts/aggregate-validity.js # -> { all_valid: boolean }
    guarantee:
      - "output && typeof output.all_valid === 'boolean'"
    on_success:
      transitions:
        - when: "output.all_valid === true"
          to: permission-check
  # 4) Policy gate (no external side‑effect but gates external actions)
  permission-check:
    type: command
    criticality: policy
    exec: node scripts/check-permissions.js # -> { allowed: boolean }
    guarantee:
      - "typeof output.allowed === 'boolean'"
  # 5) External action: only runs when policy passes (belt and suspenders)
  post-comment:
    type: github
    criticality: external
    depends_on:
      - permission-check
    on:
      - issue_opened
    # Coarse plan‑time gate (cheap & early)
    if: "outputs['permission-check'] && outputs['permission-check'].allowed === true"
    # Final execution preflight to avoid side‑effects if context shifted
    assume:
      - "outputs['permission-check'] && outputs['permission-check'].allowed === true"
      - "env.DRY_RUN !== 'true'"
    op: comment.create
    guarantee:
      - "output && typeof output.id === 'number'"
    continue_on_failure: false
  # 6) Non‑critical compute: allowed to fail softly
  summarize:
    type: ai
    criticality: info
    on:
      - issue_opened
    continue_on_failure: true
    fail_if: "(output.errors || []).length > 0"
```
Highlights
- control‑plane steps (extract-facts, aggregate) carry `guarantee` contracts and drive transitions under a tight loop budget.
- validate-fact uses `fail_if` for policy failure and a bounded retry only for transient provider errors.
- permission-check (policy) gates external actions without itself mutating external systems.
- post-comment (external) uses both `if` (early prune) and `assume` (preflight) plus a `guarantee` after posting.
- summarize shows a non‑critical step with soft failure handling via `continue_on_failure: true`.
The schema field has been unified to handle both template selection and output validation:
- `schema: <string>` selects a layout/renderer (e.g., `'code-review'`, `'markdown'`) for template rendering.
- `schema: <object>` is a JSON Schema; the engine validates `output` for any provider (ai/command/script/http). Violations create `contract/schema_validation_failed` issues and follow criticality rules.
- `output_schema` is deprecated; keep it only for backward compatibility. When both `schema` (object) and `output_schema` are present, `schema` takes precedence for validation.
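The dual role of `schema` and the precedence over `output_schema` can be sketched as a type discrimination. `SchemaField`, `schemaRole`, and `validationSchema` are hypothetical names, and actual JSON Schema validation is out of scope here.

```typescript
// Hypothetical helpers; they only decide which schema applies, not how it is validated.
type JsonSchema = Record<string, unknown>;
type SchemaField = string | JsonSchema;

// A string selects a renderer; an object is a JSON Schema used for validation.
function schemaRole(schema: SchemaField): "renderer" | "validator" {
  return typeof schema === "string" ? "renderer" : "validator";
}

// `schema` (object) wins over the deprecated `output_schema` for validation.
function validationSchema(check: {
  schema?: SchemaField;
  output_schema?: JsonSchema;
}): JsonSchema | undefined {
  if (typeof check.schema === "object") return check.schema;
  return check.output_schema;
}

const picked = validationSchema({
  schema: { type: "object", required: ["items"] },
  output_schema: { type: "array" },
});
console.log(schemaRole("code-review"), picked);
```

When `schema` is a string, the deprecated `output_schema` is still returned, so older configurations keep validating.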