Treat MCP management-tool registration race as transient, not Structural (#458) #459

Merged
rockfordlhotka merged 1 commit into main from rockfordlhotka/458-mcp-readiness-race
Jun 2, 2026

@rockfordlhotka (Member)

Fixes #458.

Problem

The 6 MCP management tools (mcp_invoke_tool, mcp_list_services, etc.) are registered lazily by McpServersIndexedHandler when the first McpServersIndexed message arrives from the MCP bridge. A wisp Direct MCP step always routes to mcp_invoke_tool (GatewayRouter.RouteMcp). If the step fired in the startup/reconnect window before that message arrived, WispExecutor resolved a null executor and hard-failed with FailureCategory.Structural, the "learnable authoring bug" category. That (a) failed the step outright instead of waiting, and (b) polluted the structural-failure signal used for wisp authoring-quality analysis and dream-time learning with what is really a transient readiness race (~4 such failures in a recent 14-day wisp-executions.jsonl window).

Fix

Contained to RockBot.Wisp, combining both fixes the issue proposes:

  • Wait for readiness (root timing): WispExecutor now resolves the executor via ResolveExecutorWithReadinessAsync. In steady state it's a single lookup (zero added latency). If a management tool is absent, it polls the registry (~100 ms interval, Stopwatch deadline) up to WispOptions.McpReadinessWait (default 5s) and proceeds as soon as the tool registers — so a step in the window succeeds.
  • Reclassify (floor): if the tool never appears, the step fails as FailureCategory.External (transient/retryable) instead of Structural, so it no longer pollutes authoring metrics. Genuinely unknown / non-MCP tools still fail Structural immediately (no wait).

WispOptions.McpReadinessWait = TimeSpan.Zero disables the wait (resolve once, then reclassify).
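
The wait-then-reclassify behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in RockBot.Wisp: the registry dictionary, the managementTools set, the string failure categories, and the local ResolveWithReadinessAsync function are hypothetical stand-ins. The real method is WispExecutor.ResolveExecutorWithReadinessAsync and the real categories are FailureCategory values, neither of which is shown in this description.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for the tool registry and the set of lazily
// registered MCP management tools (only two of the six are listed here).
var registry = new ConcurrentDictionary<string, Func<string, string>>();
var managementTools = new HashSet<string> { "mcp_invoke_tool", "mcp_list_services" };

// Returns the executor, or null plus the category the step should fail with.
// Categories are plain strings here; the real code uses FailureCategory.
async Task<(Func<string, string>? Executor, string? Failure)> ResolveWithReadinessAsync(
    string tool, TimeSpan readinessWait, CancellationToken ct = default)
{
    // Steady state: a single lookup, no added latency.
    if (registry.TryGetValue(tool, out var executor)) return (executor, null);

    // Unknown / non-MCP tools are treated as authoring errors: no wait.
    if (!managementTools.Contains(tool)) return (null, "Structural");

    // Management tool not registered yet: poll until the deadline passes.
    var pollInterval = TimeSpan.FromMilliseconds(100);
    var clock = Stopwatch.StartNew();
    while (true)
    {
        var remaining = readinessWait - clock.Elapsed;
        if (remaining <= TimeSpan.Zero) break; // also covers a zero wait
        await Task.Delay(remaining < pollInterval ? remaining : pollInterval, ct);
        if (registry.TryGetValue(tool, out executor)) return (executor, null);
    }

    // The tool never appeared: a transient readiness failure, not an authoring bug.
    return (null, "External");
}

// Simulate the bridge index arriving 250 ms after the step fires.
_ = Task.Run(async () =>
{
    await Task.Delay(250);
    registry["mcp_invoke_tool"] = args => $"invoked with {args}";
});

var late = await ResolveWithReadinessAsync("mcp_invoke_tool", TimeSpan.FromSeconds(5));
Console.WriteLine(late.Executor is not null); // True: registered during the wait

var never = await ResolveWithReadinessAsync("mcp_list_services", TimeSpan.FromMilliseconds(300));
Console.WriteLine(never.Failure); // External

var typo = await ResolveWithReadinessAsync("not_a_tool", TimeSpan.FromSeconds(5));
Console.WriteLine(typo.Failure); // Structural
```

Capping each delay at the remaining time keeps the total wait close to the configured deadline. A zero wait skips the loop entirely, so the lookup happens once and an absent management tool is reported as External.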

Out of scope (follow-ups)

  • WispExecutor.ClassifyToolError maps post-invoke runtime errors containing "not registered" to Structural — same class of issue, distinct path; left untouched to avoid reclassifying genuine authoring typos.
  • CapabilityClaimVerifier / RepairTicketVerifier already handle the race (Uncertain, uncached). McpStepValidator is unaffected (skips validation when no schema is found).

Testing

  • dotnet test RockBot.slnx --filter "ClassName~WispExecutorTests" → 25/25 passing (3 new): management tool absent → External; tool registers during the wait → step succeeds; non-MCP tool stays Structural immediately.
  • dotnet build RockBot.slnx → 0 errors.

Version bumped to 0.12.26.

🤖 Generated with Claude Code

…ral (#458)

mcp_invoke_tool and the other MCP management tools are registered lazily on
the first McpServersIndexed message from the bridge. A wisp Direct MCP step
firing in the startup/reconnect window found the tool unregistered and
hard-failed with FailureCategory.Structural, polluting the wisp
authoring-quality signal with what is really a transient readiness race.

WispExecutor now briefly waits (WispOptions.McpReadinessWait, default 5s) for
the management tools to register before invoking, so a step in the window
succeeds once the bridge index arrives. If they never register, the step
fails as External (transient/retryable) instead of Structural. Genuinely
unknown / non-MCP tools still fail Structural immediately.

Bump version to 0.12.26.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@rockfordlhotka merged commit 783c141 into main on Jun 2, 2026
2 checks passed
@rockfordlhotka deleted the rockfordlhotka/458-mcp-readiness-race branch on June 2, 2026 17:10

Linked issue

Race: wisp/scheduled MCP step before first McpServersIndexed reports "mcp_invoke_tool is not registered" (misclassified Structural)
