Status: Draft
Last updated: 2025-10-17
Owner: Visor team
Related: telemetry-tracing-rfc.md
We want an interactive debugging experience in which developers can visualize visor execution in real time, inspect the full state at any point, and use time-travel debugging to understand complex check flows. It should work both live (streaming) and offline (from saved OTEL trace files).
Goals:
- Real-time Visualization: Live DAG showing check execution flow, dependencies, and status
- Full State Inspection: Click any node to see complete input/output, context variables, timing
- Time-Travel Debugging: Scrub timeline to replay execution and compare states
- OTEL-Native: Built entirely on OpenTelemetry spans, attributes, and events
- Dual Mode: Stream live execution via WebSocket OR load saved OTEL trace files
- Zero Code Changes: Works with existing OTEL infrastructure, just enhanced attributes
Non-goals:
- Replacing existing CLI output or GitHub comments
- Full IDE debugger features (breakpoints, stepping)
- Distributed tracing across multiple services (single visor run only)
Build an interactive HTML-based debugger that reads OpenTelemetry spans (either streaming or from NDJSON files) and visualizes:
- Execution Graph: Force-directed DAG of checks, dependencies, and data flow
- State Inspector: Full context, inputs, outputs, transforms for each node
- Timeline: Execution timeline with ability to scrub/replay
- Metrics Dashboard: Issue counts, durations, routing decisions
┌─────────────────────────────────────────────────────────────┐
│                       Visor Execution                       │
│      (Enhanced OTEL spans with full state attributes)       │
└───────────────┬─────────────────────────────────────────────┘
                │
                ├─ Live Mode ──────> WebSocket Server ──────┐
                │                    (port 3456)            │
                │                                           │
                └─ File Mode ──────> NDJSON trace files ────┤
                                    (output/traces/*.ndjson)│
                                                            │
                                                            ▼
                             ┌────────────────────────────────┐
                             │  Trace Reader & Processor      │
                             │  - Parse OTEL spans            │
                             │  - Rebuild execution tree      │
                             │  - Extract state snapshots     │
                             └───────────┬────────────────────┘
                                         │
                                         ▼
                             ┌────────────────────────────────┐
                             │  Interactive HTML UI           │
                             │  - D3.js DAG visualization     │
                             │  - State inspector panel       │
                             │  - Timeline scrubber           │
                             │  - Diff viewer                 │
                             └────────────────────────────────┘
To enable full debugging, we enhance existing spans with complete state attributes:
- ✅ Span hierarchy: visor.run → visor.check → visor.provider
- ✅ Events: check.started/completed, fail_if.triggered, retry.scheduled
- ✅ Basic attributes: check.id, check.type, duration, issue counts
Check Input Context (Liquid template variables):
span.setAttribute('visor.check.input.context', JSON.stringify({
  pr: { /* full PR object */ },
  outputs: { /* all previous outputs */ },
  env: { /* safe env vars */ },
  memory: { /* memory store */ }
}));
Check Output:
span.setAttribute('visor.check.output', JSON.stringify(output));
span.setAttribute('visor.check.output.type', typeof output);
// Record the length only for arrays instead of passing null as an attribute value
if (Array.isArray(output)) {
  span.setAttribute('visor.check.output.length', output.length);
}
forEach State:
span.setAttribute('visor.foreach.items', JSON.stringify(items));
span.setAttribute('visor.foreach.current_item', JSON.stringify(items[index]));
Transform/Evaluation Details:
span.setAttribute('visor.transform.code', transformJS);
span.setAttribute('visor.transform.input', JSON.stringify(input));
span.setAttribute('visor.transform.output', JSON.stringify(output));
State Snapshots (time-travel):
span.addEvent('state.snapshot', {
  'visor.snapshot.outputs': JSON.stringify(allOutputs),
  'visor.snapshot.memory': JSON.stringify(memoryStore),
  'visor.snapshot.timestamp': new Date().toISOString()
});
File: src/telemetry/state-capture.ts
Utilities for capturing complete execution state in OTEL spans:
- captureCheckInputContext(span, context) - Liquid template variables
- captureCheckOutput(span, output) - Check results
- captureForEachState(span, items, index, current) - Iteration state
- captureLiquidEvaluation(span, template, context, result) - Template details
- captureTransformJS(span, code, input, output) - Transform execution
- captureProviderCall(span, type, request, response) - Provider calls
- captureStateSnapshot(span, checkId, outputs, memory) - Full state
Size Limits:
- Max attribute length: 10KB (truncate with ...[truncated])
- Max array items: 100 (store preview + indicate truncation)
- Detect circular references
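The three size limits above could be enforced by one serialization helper that every capture function calls before setting an attribute. The sketch below is one possible shape, not the contents of state-capture.ts: the names `safeSerialize` and `sanitize`, the `[Circular]` placeholder, and the wording of the array marker are illustrative assumptions, and the 10KB limit is counted in string characters rather than bytes.

```typescript
// Limits taken from the "Size Limits" list; the names are illustrative.
const MAX_ATTRIBUTE_LENGTH = 10 * 1024; // 10KB, counted in string characters
const MAX_ARRAY_ITEMS = 100;
const TRUNCATION_MARKER = '...[truncated]';

// Copy a JSON-like value, cutting long arrays and replacing circular
// references. Assumes plain data (no Date, Map, or class instances).
function sanitize(value: unknown, ancestors: object[]): unknown {
  if (typeof value !== 'object' || value === null) return value;
  // Only objects on the current path count as circular; an object that is
  // merely referenced twice is serialized twice.
  if (ancestors.indexOf(value) !== -1) return '[Circular]';
  ancestors.push(value);
  let result: unknown;
  if (Array.isArray(value)) {
    const preview: unknown[] = value
      .slice(0, MAX_ARRAY_ITEMS)
      .map((item) => sanitize(item, ancestors));
    if (value.length > MAX_ARRAY_ITEMS) {
      preview.push(`...[${value.length - MAX_ARRAY_ITEMS} more items]`);
    }
    result = preview;
  } else {
    const source = value as Record<string, unknown>;
    const copy: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      copy[key] = sanitize(source[key], ancestors);
    }
    result = copy;
  }
  ancestors.pop();
  return result;
}

// Serialize a value for use as a span attribute, applying all three limits.
function safeSerialize(value: unknown): string {
  // JSON.stringify returns undefined for undefined and for functions.
  const json: string | undefined = JSON.stringify(sanitize(value, []));
  if (json === undefined) return 'undefined';
  if (json.length <= MAX_ATTRIBUTE_LENGTH) return json;
  return json.slice(0, MAX_ATTRIBUTE_LENGTH - TRUNCATION_MARKER.length) + TRUNCATION_MARKER;
}
```

A string cut at the 10KB limit is no longer valid JSON, so a reader built on this approach would need to check for the trailing marker before calling `JSON.parse`.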
File: src/debug-visualizer/trace-reader.ts
Reads OTEL NDJSON files and rebuilds execution tree:
interface ExecutionTrace {
  runId: string;
  spans: ProcessedSpan[];
  tree: ExecutionNode; // Hierarchical structure
  timeline: TimelineEvent[];
  snapshots: StateSnapshot[];
}
interface ProcessedSpan {
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  name: string;
  startTime: number;
  endTime: number;
  duration: number;
  attributes: Record<string, any>;
  events: SpanEvent[];
  status: 'ok' | 'error';
}
interface ExecutionNode {
  checkId: string;
  type: string;
  status: 'pending' | 'running' | 'completed' | 'error' | 'skipped';
  children: ExecutionNode[];
  span: ProcessedSpan;
  state: {
    inputContext?: any;
    output?: any;
    errors?: string[];
  };
}
Functions:
- parseNDJSONTrace(filePath) - Read NDJSON file
- buildExecutionTree(spans) - Construct hierarchy
- extractStateSnapshots(spans) - Get time-travel points
- computeTimeline(spans) - Create timeline events
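The first two functions can be illustrated with a small sketch that parses NDJSON text and links spans to their parents. It is a simplified stand-in, not the actual trace-reader.ts: `SpanLike`, `TreeNode`, `parseNDJSON`, and `buildTree` are hypothetical names, the input is assumed to hold one span object per line with `spanId` and `parentSpanId` fields, and the function returns a list of roots rather than the single `ExecutionNode` described above.

```typescript
// Minimal span shape for this sketch; ProcessedSpan carries more fields.
interface SpanLike {
  spanId: string;
  parentSpanId?: string;
  name: string;
  startTime: number;
}

interface TreeNode {
  span: SpanLike;
  children: TreeNode[];
}

// Parse NDJSON text: one JSON object per non-empty line.
function parseNDJSON(text: string): SpanLike[] {
  const spans: SpanLike[] = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '') continue;
    try {
      spans.push(JSON.parse(line) as SpanLike);
    } catch {
      // Skip malformed lines, for example a partially written last line.
    }
  }
  return spans;
}

// Link each span to its parent by parentSpanId and return the root nodes.
function buildTree(spans: SpanLike[]): TreeNode[] {
  const nodesById: Record<string, TreeNode> = Object.create(null);
  for (const span of spans) {
    nodesById[span.spanId] = { span, children: [] };
  }
  const roots: TreeNode[] = [];
  for (const span of spans) {
    const node = nodesById[span.spanId];
    const parent = span.parentSpanId ? nodesById[span.parentSpanId] : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node); // no parent, or the parent is missing from the file
    }
  }
  // Order siblings by start time so the tree reads chronologically.
  const sortByStart = (nodes: TreeNode[]): void => {
    nodes.sort((a, b) => a.span.startTime - b.span.startTime);
    nodes.forEach((n) => sortByStart(n.children));
  };
  sortByStart(roots);
  return roots;
}
```

Treating spans with an unknown parent as roots keeps a truncated trace viewable instead of silently dropping its orphaned spans.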
File: src/debug-visualizer/ws-server.ts
Real-time streaming of OTEL spans during live execution:
declare class DebugVisualizerServer {
  start(port?: number): void; // port defaults to 3456
  stop(): void;
  // Called by OTEL exporter to stream spans
  emitSpan(span: ProcessedSpan): void;
  emitEvent(event: SpanEvent): void;
  emitStateUpdate(checkId: string, state: any): void;
}
- WebSocket server on port 3456
- Broadcasts spans as they're created
- Supports multiple connected clients
- Heartbeat to detect disconnects
Integration: Add custom OTEL span exporter that also pushes to WS server when debug mode is enabled.
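A minimal sketch of such an exporter follows, assuming the exporter contract used by the OpenTelemetry JS SDK (an `export(spans, resultCallback)` method plus `shutdown()`). The types here are local structural stand-ins so the block is self-contained; `DebugSpanExporterSketch`, `SpanSink`, and `StreamedSpan` are hypothetical names, and the field names on the incoming span are assumptions that may differ between SDK versions. One caveat: exporters receive spans only after they have ended, so showing a check as "running" would presumably rely on a span processor's start hook or on the separate emitEvent/emitStateUpdate calls.

```typescript
// Structural stand-ins; a real build would import the SDK types instead.
type HrTime = [number, number]; // [seconds, nanoseconds]

interface ReadableSpanLike {
  name: string;
  startTime: HrTime;
  endTime: HrTime;
  attributes: Record<string, unknown>;
  parentSpanId?: string;
  spanContext(): { traceId: string; spanId: string };
}

interface ExportResultLike {
  code: number; // 0 = success, 1 = failed
}

interface StreamedSpan {
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  name: string;
  startTime: number; // milliseconds
  endTime: number; // milliseconds
  duration: number; // milliseconds
  attributes: Record<string, unknown>;
}

// Anything that can receive spans, e.g. the debug WebSocket server.
interface SpanSink {
  emitSpan(span: StreamedSpan): void;
}

const hrTimeToMs = (t: HrTime): number => t[0] * 1000 + t[1] / 1e6;

class DebugSpanExporterSketch {
  private readonly sink: SpanSink;

  constructor(sink: SpanSink) {
    this.sink = sink;
  }

  // Convert each finished span and forward it to the sink.
  export(spans: ReadableSpanLike[], done: (result: ExportResultLike) => void): void {
    try {
      for (const span of spans) {
        const ctx = span.spanContext();
        const startTime = hrTimeToMs(span.startTime);
        const endTime = hrTimeToMs(span.endTime);
        this.sink.emitSpan({
          traceId: ctx.traceId,
          spanId: ctx.spanId,
          parentSpanId: span.parentSpanId,
          name: span.name,
          startTime,
          endTime,
          duration: endTime - startTime,
          attributes: span.attributes,
        });
      }
      done({ code: 0 });
    } catch {
      // A failing debug sink should not break the traced run itself.
      done({ code: 1 });
    }
  }

  shutdown(): Promise<void> {
    return Promise.resolve();
  }
}
```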
File: src/debug-visualizer/ui/index.html (single-file, no build step)
Self-contained HTML with embedded CSS/JS using:
- D3.js v7: Force-directed graph, timeline visualization
- Monaco Editor (optional): Syntax highlighting for code/JSON
- Vanilla JS: No framework, keep it simple
Features:
a) Execution Graph
- Force-directed DAG showing all checks
- Node colors: pending (gray), running (blue), success (green), error (red), skipped (yellow)
- Edges show dependencies (solid) and data flow (dashed)
- Click node → show state inspector
- Hover → show timing tooltip
b) State Inspector Panel
- Tabs: Input Context, Output, Events, Attributes, Code
- JSON tree view with expand/collapse
- Syntax highlighting for code snippets
- Copy button for each section
c) Timeline
- Horizontal timeline showing check execution spans
- Gantt-chart style with parallelism visualization
- Scrubber to jump to specific time
- Play/pause for animation
d) Time-Travel
- Slider to scrub through execution history
- Graph updates to show state at selected time
- Diff view: compare two timepoints side-by-side
- Snapshot markers on timeline
e) Metrics Dashboard
- Issue count by severity (bar chart)
- Duration histogram
- Routing actions (retry/goto counts)
- forEach iterations summary
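The dashboard numbers are simple aggregations over per-check results. The sketch below shows one way to compute the issue counts by severity and the slowest check; `CheckSummary`, `Metrics`, and `computeMetrics` are hypothetical names, and the severity set (in particular `info`) is an assumption rather than the list Visor actually uses.

```typescript
// Severity levels are an assumption for this sketch.
type Severity = 'critical' | 'error' | 'warning' | 'info';

interface CheckSummary {
  checkId: string;
  durationMs: number;
  issues: { severity: Severity }[];
}

interface Metrics {
  checks: number;
  totalIssues: number;
  bySeverity: Record<Severity, number>;
  slowestCheck: string | undefined;
}

// Aggregate per-check results into the dashboard numbers.
function computeMetrics(checks: CheckSummary[]): Metrics {
  const bySeverity: Record<Severity, number> = { critical: 0, error: 0, warning: 0, info: 0 };
  let totalIssues = 0;
  let slowest: CheckSummary | undefined;
  for (const check of checks) {
    for (const issue of check.issues) {
      bySeverity[issue.severity] += 1;
      totalIssues += 1;
    }
    if (!slowest || check.durationMs > slowest.durationMs) {
      slowest = check;
    }
  }
  return {
    checks: checks.length,
    totalIssues,
    bySeverity,
    slowestCheck: slowest ? slowest.checkId : undefined,
  };
}
```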
File: src/cli-main.ts (modifications)
New CLI modes:
# Live mode: run visor with live visualization
visor --debug-server
# Opens browser at http://localhost:3456
# Streams execution in real-time
# Replay mode: visualize saved trace
visor --debug-replay output/traces/run-2025-10-17.ndjson
# Opens browser showing completed execution
# Serve mode: just run the server (no execution)
visor --debug-serve output/traces/run-2025-10-17.ndjson
Implementation:
import open from 'open'; // belongs with the other imports at the top of cli-main.ts

if (opts.debugServer || opts.debugReplay) {
  const server = new DebugVisualizerServer();
  await server.start(3456);
  if (opts.debugReplay) {
    // Load trace file and send to connected clients
    const trace = await parseNDJSONTrace(opts.debugReplay);
    server.loadTrace(trace);
  } else {
    // Live mode: add WS exporter to OTEL
    await initTelemetry({
      enabled: true,
      sink: 'file',
      debugServer: server // Pass server to exporter
    });
  }
  // Open browser
  await open('http://localhost:3456');
}
Sensitive Data Handling:
- Truncate large attributes (max 10KB per attribute)
- Option to redact: --debug-redact (hash file paths, mask tokens)
- Never capture raw code by default (only summaries)
- Provider request/response: capture lengths and previews only
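To make the redaction option concrete, here is a rough sketch of the two operations it names: masking tokens and hashing file paths. Everything in it is an assumption for illustration: the function names, the `[REDACTED]` placeholder, the two token patterns (GitHub-style prefixes and HTTP Bearer values), and the use of a 32-bit FNV-1a hash over UTF-16 code units, which is not cryptographic and stands in for whatever digest the implementation actually uses.

```typescript
const REDACTED = '[REDACTED]';

// Mask values that look like credentials. The two patterns are examples,
// not a complete list of secret formats.
function maskTokens(text: string): string {
  return text
    .replace(/\bgh[pousr]_[A-Za-z0-9]{20,}/g, REDACTED)
    .replace(/\b(Bearer\s+)[^\s"']+/g, '$1' + REDACTED);
}

// 32-bit FNV-1a over UTF-16 code units. Non-cryptographic: it only gives a
// stable short label, so equal paths stay recognizable as equal.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ('0000000' + (hash >>> 0).toString(16)).slice(-8);
}

// Replace a file path with a hash, keeping the extension for readability.
function redactPath(path: string): string {
  const fileName = path.slice(path.lastIndexOf('/') + 1);
  const dot = fileName.lastIndexOf('.');
  const extension = dot > 0 ? fileName.slice(dot) : '';
  return 'path:' + fnv1a(path) + extension;
}
```

Keeping the hash deterministic means the same file still lines up across spans in the graph, at the cost that short or guessable paths could be recovered by brute force.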
Access Control:
- Debug server runs on localhost only by default
- Option for --debug-host 0.0.0.0 with warning
- No authentication (local dev tool)
State Snapshots:
- Emit state.snapshot events at key points:
  - After each check completes
  - Before/after forEach iteration
  - Before routing decision (retry/goto)
- Events contain full outputs and memory state
Replay Algorithm:
- Load all spans and events
- Sort by timestamp
- For each timepoint, reconstruct state:
  - Apply events in order up to selected time
  - Show which checks were running
  - Display accumulated outputs
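The replay steps above can be sketched as a pure function that folds events up to the selected time. This is a simplified model rather than the UI's actual engine: `ReplayEvent`, `ReplayState`, and `stateAt` are hypothetical names, only the check.started and check.completed events are modeled, and times are assumed to be milliseconds since the start of the run.

```typescript
type ReplayEvent =
  | { time: number; type: 'check.started'; checkId: string }
  | { time: number; type: 'check.completed'; checkId: string; output: unknown };

interface ReplayState {
  running: string[];
  completed: string[];
  outputs: Record<string, unknown>;
}

// Reconstruct the execution state as it was at `time`.
function stateAt(events: ReplayEvent[], time: number): ReplayState {
  // Sort a copy so the caller's array is left untouched.
  const sorted = events.slice().sort((a, b) => a.time - b.time);
  const status: Record<string, 'running' | 'completed'> = {};
  const outputs: Record<string, unknown> = {};
  for (const event of sorted) {
    if (event.time > time) break; // everything later is in the future
    if (event.type === 'check.started') {
      status[event.checkId] = 'running';
    } else {
      status[event.checkId] = 'completed';
      outputs[event.checkId] = event.output;
    }
  }
  const ids = Object.keys(status);
  return {
    running: ids.filter((id) => status[id] === 'running'),
    completed: ids.filter((id) => status[id] === 'completed'),
    outputs,
  };
}
```

Replaying from the first event on every scrub is linear in the number of events. For long traces, starting from the nearest earlier state.snapshot event and applying only the events after it would be a natural optimization.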
Diff View:
- User selects two timepoints (A and B)
- Compute delta:
  - New outputs between A and B
  - Changed check states
- Highlight differences in JSON viewer
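The delta computation can be as small as a key-by-key comparison of the accumulated outputs at the two timepoints. The sketch below is an illustration with hypothetical names (`OutputsDiff`, `diffOutputs`). It compares values by their JSON text, which is simple but sensitive to key order, so a real diff viewer might prefer a structural comparison.

```typescript
interface OutputsDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

// Compare the outputs at timepoint A (`before`) and timepoint B (`after`).
function diffOutputs(
  before: Record<string, unknown>,
  after: Record<string, unknown>
): OutputsDiff {
  const has = (obj: Record<string, unknown>, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);
  const diff: OutputsDiff = { added: [], removed: [], changed: [] };
  for (const key of Object.keys(after)) {
    if (!has(before, key)) {
      diff.added.push(key);
    } else if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      diff.changed.push(key);
    }
  }
  for (const key of Object.keys(before)) {
    if (!has(after, key)) diff.removed.push(key);
  }
  return diff;
}
```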
┌────────────────────────────────────────────────────────────────┐
│  Visor Debug Visualizer                              [Live] ●  │
├────────────────────────────────────────────────────────────────┤
│  Timeline: [===============●====================] 2.3s / 4.1s  │
│            [Play] [Pause] [<<] [>>] Speed: 1x                  │
├───────────────────────────────┬────────────────────────────────┤
│                               │                                │
│  Execution Graph              │  State Inspector               │
│                               │                                │
│    ┌─┐  ┌─┐                   │  Check: security-scan          │
│    │A├─>│B│                   │  Status: ✓ completed (1.2s)    │
│    └─┘  └┬┘                   │                                │
│         ┌▼┐ ┌─┐               │  [Input] [Output] [Events]     │
│         │C├>│D│               │                                │
│         └─┘ └─┘               │  Output:                       │
│                               │  {                             │
│  Legend:                      │    "issues": [                 │
│  ● Running ✓ Done ✗ Error     │      {...}                     │
│  ○ Pending ⊘ Skipped          │    ]                           │
│                               │  }                             │
│                               │                                │
│                               │  [Copy JSON] [View Diff]       │
├───────────────────────────────┴────────────────────────────────┤
│  Metrics: 3 checks, 12 issues (2 critical, 5 error, 5 warning) │
└────────────────────────────────────────────────────────────────┘
Goal (M1, Foundation): Enhanced OTEL spans contain complete execution state
Tasks:
- Implement state-capture.ts module with all capture functions
- Integrate captureCheckInputContext() in check execution engine
- Integrate captureCheckOutput() after check completion
- Integrate captureForEachState() in forEach iteration loop
- Add captureStateSnapshot() events at key execution points
- Write unit tests for all capture functions
- Integrate state capture in Command Provider
- Integrate state capture in AI Provider
- Integrate state capture in HTTP Provider
- Create E2E acceptance test
Acceptance Test:
# Run visor with telemetry enabled
VISOR_TELEMETRY_ENABLED=true visor --config test-config.yaml
# Verify NDJSON contains enhanced attributes
cat output/traces/run-*.ndjson | jq '.attributes | select(."visor.check.input.context")' | head -n 1
# Should see: full JSON object with pr, outputs, env, memory
Success Criteria: ✅ ALL MET
- At least one span has visor.check.input.context attribute
- At least one span has visor.check.output attribute
- forEach spans have visor.foreach.items attribute
- At least one state.snapshot event is present
- All tests pass
Deliverables:
- ✅ src/telemetry/state-capture.ts (337 lines)
- ✅ tests/unit/telemetry/state-capture.test.ts (246 lines)
- ✅ tests/e2e/state-capture-e2e.test.ts (195 lines)
- ✅ Integration in 3 providers + execution engine
Goal (M2, Data Layer): Can parse NDJSON and rebuild execution tree structure
Tasks:
- Create src/debug-visualizer/trace-reader.ts
- Implement parseNDJSONTrace() - read and parse file
- Implement buildExecutionTree() - construct parent/child hierarchy
- Implement extractStateSnapshots() - collect time-travel points
- Implement computeTimeline() - chronological event list
- Add tests with fixture NDJSON files
Acceptance Test:
# Create test script
cat > test-trace-reader.js << 'EOF'
const assert = require('assert');
const { parseNDJSONTrace, buildExecutionTree } = require('./dist/debug-visualizer/trace-reader');
async function test() {
  // Take a concrete file path: the shell expands globs, Node does not
  const trace = await parseNDJSONTrace(process.argv[2]);
  console.log(`Parsed ${trace.spans.length} spans`);
  const tree = buildExecutionTree(trace.spans);
  console.log(`Root node: ${tree.checkId}`);
  console.log(`Children: ${tree.children.length}`);
  assert(trace.spans.length > 0, 'Should have spans');
  assert(tree.children.length > 0, 'Should have child nodes');
  console.log('✅ All assertions passed');
}
test().catch((err) => {
  console.error(err);
  process.exit(1);
});
EOF
node test-trace-reader.js "$(ls output/traces/run-*.ndjson | head -n 1)"
Success Criteria: ✅ ALL MET
- Can parse valid NDJSON trace file without errors
- Execution tree has correct parent-child relationships
- All spans are accounted for in the tree
- State snapshots are extracted with timestamps
- Timeline events are in chronological order
- All tests pass (26/26 passing)
Deliverables:
- ✅ src/debug-visualizer/trace-reader.ts (484 lines)
- ✅ tests/unit/debug-visualizer/trace-reader.test.ts (330 lines)
- ✅ Test fixtures: sample-trace.ndjson, error-trace.ndjson, empty-trace.ndjson
- ✅ 26 comprehensive unit tests covering all functions
- ✅ 100% test pass rate
Goal (M3, Visualization): Can open HTML file and see visualized execution graph
Tasks:
- Create src/debug-visualizer/ui/index.html (single file)
- Implement trace file loader (file upload or URL param)
- Implement D3.js force-directed graph of checks
- Implement node coloring by status (pending/running/success/error)
- Implement basic state inspector panel (JSON viewer)
- Add click handler to show check details
Acceptance Test:
# Build the project
npm run build
# Run visor to generate trace
VISOR_TELEMETRY_ENABLED=true ./dist/cli-main.js --check all
# Copy UI file to output directory
cp src/debug-visualizer/ui/index.html output/traces/
# Open in browser
open output/traces/index.html
# Manual verification:
# 1. Should see execution graph with nodes
# 2. Click a node -> inspector shows check details
# 3. Verify node colors match execution status
# 4. Verify all checks are visible in graph
Success Criteria: ✅ ALL MET
- HTML file loads without errors in browser
- Execution graph renders with all checks visible
- Nodes are colored correctly (green=success, red=error, etc.)
- Clicking node shows state inspector panel
- Inspector displays input context, output, and attributes
- Can load trace file via file picker or URL parameter
Deliverables:
- ✅ src/debug-visualizer/ui/index.html (27KB single file)
- ✅ Zero build step required - pure HTML/CSS/JS
- ✅ D3.js v7 for force-directed graph
- ✅ Interactive inspector with 4 tabs (Overview, Input, Output, Events)
- ✅ JSON syntax highlighting
- ✅ Pan, zoom, and drag support
- ✅ File upload + URL parameter loading
- ✅ Status-based color coding with legend
- ✅ Manual testing guide (README.md)
Goal (M4, Real-time): Real-time visualization of running visor execution
Status: Fully integrated and operational
Tasks:
- Create src/debug-visualizer/ws-server.ts - WebSocket server
- Implement WebSocket server on port 3456 with HTTP fallback
- Create custom OTEL span exporter (debug-span-exporter.ts)
- Add --debug-server and --debug-port CLI flags
- Integrate debug server into CLI main
- Update UI with WebSocket client code
- Add auto-open browser functionality
- Install dependencies (ws@^8.18.3, open@^9.1.0)
Acceptance Test:
# Terminal 1: Start visor in debug server mode
./dist/cli-main.js --debug-server --check all
# Should see:
# "Debug visualizer running at http://localhost:3456"
# Browser opens automatically
# Manual verification:
# 1. Graph should start empty
# 2. As checks execute, nodes appear in real-time
# 3. Node colors update from pending -> running -> success/error
# 4. Can click running checks to see current state
# 5. After completion, full execution graph is visible
Success Criteria: ✅ ALL MET
- WebSocket server module implemented
- Custom OTEL exporter implemented
- CLI option types defined
- WebSocket server starts on port 3456
- Browser opens automatically
- UI receives span updates in real-time
- Graph updates as checks execute
- Can inspect state of currently running checks
- Server shuts down cleanly when visor exits
- Multiple browser tabs can connect simultaneously
- Build passes (npm run build)
Deliverables:
- ✅ src/debug-visualizer/ws-server.ts (310 lines) - WebSocket server
- ✅ src/debug-visualizer/debug-span-exporter.ts (121 lines) - OTEL exporter
- ✅ src/types/cli.ts (updated) - CLI option types
- ✅ src/cli.ts (updated) - CLI flags integration
- ✅ src/cli-main.ts (updated) - Server initialization and cleanup
- ✅ src/telemetry/opentelemetry.ts (updated) - Debug exporter support
- ✅ src/debug-visualizer/ui/index.html (updated) - WebSocket client
- ✅ package.json (updated) - Dependencies and build script
Dependencies Installed:
- ✅ ws@^8.18.3
- ✅ open@^9.1.0
- ✅ @types/ws@^8.18.1
Goal (M5, Time-Travel): Can scrub timeline and replay execution history
Tasks:
- Add timeline component to UI (horizontal scrubber)
- Implement time-travel state reconstruction
- Add play/pause controls for animated replay
- Implement diff view between two timepoints
- Add state snapshot markers on timeline
- Add keyboard shortcuts (space=play/pause, arrows=step)
- Build snapshot navigation panel
- Add playback speed controls (0.5×, 1×, 2×, 5×)
- Implement event counter and time display
- Write comprehensive unit tests
Acceptance Test:
# Load completed trace in UI
open "output/traces/index.html?trace=run-2025-10-17.ndjson"
# Manual verification:
# 1. Timeline shows full execution duration
# 2. Drag scrubber to middle -> graph shows partial execution
# 3. Click Play -> execution replays with animation
# 4. Click two timepoints -> diff view shows what changed
# 5. State snapshots appear as markers on timeline
# 6. Space bar toggles play/pause
# 7. Arrow keys step forward/backward
Success Criteria: ✅ ALL MET
- Timeline scrubber updates graph to show state at selected time
- Play button animates execution from start to finish
- Can pause at any point and inspect state
- Diff view highlights changes between timepoints
- State snapshot markers are clickable
- Keyboard shortcuts work correctly
- Performance: smooth scrubbing with 1000+ spans
Deliverables:
- ✅ src/debug-visualizer/ui/index.html (updated) - Timeline component, styles, and JavaScript engine
- ✅ tests/unit/debug-visualizer/time-travel.test.ts - 17 comprehensive unit tests
Goal (M6, Production): Polished, documented, and production-ready feature
Tasks:
- Add metrics dashboard panel (issue counts, durations)
- Add search/filter for checks by name or tag
- Add export functionality (save graph as PNG/SVG)
- Write user documentation with examples
- Add demo video/GIF to docs
- Performance optimization (virtualization for large traces)
- Add --debug-replay CLI flag for offline viewing
Acceptance Test:
# Test replay mode
./dist/cli-main.js --debug-replay output/traces/run-2025-10-17.ndjson
# Test all features end-to-end
npm run test:e2e:debug-visualizer
# Should test:
# - Load large trace (1000+ spans)
# - Search for specific check
# - Export graph as PNG
# - Time-travel through execution
# - Diff two states
# - Verify metrics dashboard accuracy
Success Criteria:
- --debug-replay flag works correctly
- Metrics dashboard shows accurate counts
- Search finds checks by name/tag
- Export produces valid PNG/SVG files
- Documentation includes screenshots and examples
- Performance test: handles 1000+ spans smoothly (<2s load)
- All E2E tests pass
- Feature announced in changelog/release notes
The debug visualizer is complete when:
- Foundation: OTEL spans capture complete execution state ✅ M1 DONE
- Data Layer: Can parse traces and rebuild execution tree ✅ M2 DONE
- Visualization: Can see execution graph in browser ✅ M3 DONE
- Real-time: Can stream live execution ✅ M4 DONE
- Time-Travel: Can scrub timeline and see historical state ✅ M5 DONE
- Production: Polished UI with docs and tests (M6)
- Should we bundle UI assets or keep single-file HTML?
  - Leaning toward single-file for simplicity
- Should WebSocket server be opt-in or always-on in dev?
  - Opt-in with --debug-server flag
- Do we need authentication for remote access?
  - Not in v1 (localhost only), can add later
- Should we support multiple simultaneous runs?
  - Not in v1, one run at a time
- Export format for sharing traces?
  - NDJSON files are already portable, maybe add .visor-trace zip format
- Record/Replay: Save execution + state, replay with different inputs
- Breakpoints: Pause execution at specific checks (requires agent mode)
- Performance Profiling: Flame graphs, bottleneck detection
- Distributed Tracing: Multiple visor runs, cross-repo analysis
- AI Assistant: "Why did check X fail?" with LLM-powered analysis
- VSCode Extension: Embedded visualizer in editor
- Collaborative Debugging: Share live sessions via URL