Treat MCP management-tool registration race as transient, not Structural (#458) #459
Merged
…ral (#458)

mcp_invoke_tool and the other MCP management tools are registered lazily on the first McpServersIndexed message from the bridge. A wisp Direct MCP step firing in the startup/reconnect window found the tool unregistered and hard-failed with FailureCategory.Structural, polluting the wisp authoring-quality signal with what is really a transient readiness race.

WispExecutor now briefly waits (WispOptions.McpReadinessWait, default 5s) for the management tools to register before invoking, so a step in the window succeeds once the bridge index arrives. If they never register, the step fails as External (transient/retryable) instead of Structural. Genuinely unknown / non-MCP tools still fail Structural immediately.

Bump version to 0.12.26.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes #458.
Problem
The 6 MCP management tools (mcp_invoke_tool, mcp_list_services, etc.) are registered lazily by McpServersIndexedHandler on the first McpServersIndexed message from the MCP bridge. A wisp Direct MCP step always routes to mcp_invoke_tool (GatewayRouter.RouteMcp); if it fired in the startup/reconnect window before that message, WispExecutor resolved a null executor and hard-failed with FailureCategory.Structural, the "learnable authoring bug" category. That (a) failed the step outright instead of waiting, and (b) polluted the structural-failure signal used for wisp authoring-quality analysis and dream-time learning with what is really a transient readiness race (~4 such failures in a recent 14-day wisp-executions.jsonl window).

Fix
Contained to RockBot.Wisp, combining both fixes the issue proposes:

- WispExecutor now resolves the executor via ResolveExecutorWithReadinessAsync. In steady state it's a single lookup (zero added latency). If a management tool is absent, it polls the registry (~100 ms interval, Stopwatch deadline) up to WispOptions.McpReadinessWait (default 5s) and proceeds as soon as the tool registers, so a step in the window succeeds.
- If the tool never registers, the step fails as FailureCategory.External (transient/retryable) instead of Structural, so it no longer pollutes authoring metrics. Genuinely unknown / non-MCP tools still fail Structural immediately (no wait).

WispOptions.McpReadinessWait = TimeSpan.Zero disables the wait (resolve once, then reclassify).

Out of scope (follow-ups)
- WispExecutor.ClassifyToolError maps post-invoke runtime errors containing "not registered" to Structural. This is the same class of issue on a distinct path; it is left untouched to avoid reclassifying genuine authoring typos.
- CapabilityClaimVerifier / RepairTicketVerifier already handle the race (Uncertain, uncached).
- McpStepValidator is unaffected (skips validation when no schema is found).

Testing
- dotnet test RockBot.slnx --filter "ClassName~WispExecutorTests" → 25/25 passing (3 new): management tool absent → External; tool registers during the wait → step succeeds; non-MCP tool stays Structural immediately.
- dotnet build RockBot.slnx → 0 errors.

Version bumped to 0.12.26.
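For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the readiness wait and reclassification described under Fix, exercised against the three scenarios listed under Testing. It is not the RockBot code: ToolRegistry, ReadinessSketch, ResolveAsync and the FailureCategory.None member are hypothetical stand-ins, and the management-tool set lists only the two names this PR mentions. The 100 ms poll interval, the Stopwatch deadline and the External / Structural split follow the description above.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Scenario 1: the bridge index arrives 300 ms into the wait, so the step can proceed.
var registry = new ToolRegistry();
_ = Task.Run(async () =>
{
    await Task.Delay(300);
    registry.Register("mcp_invoke_tool", input => "ok");
});
var ready = await ReadinessSketch.ResolveAsync(registry, "mcp_invoke_tool", TimeSpan.FromSeconds(5));
Console.WriteLine(ready.Failure);   // None

// Scenario 2: a management tool that never registers is transient, not an authoring bug.
var absent = await ReadinessSketch.ResolveAsync(registry, "mcp_list_services", TimeSpan.FromMilliseconds(250));
Console.WriteLine(absent.Failure);  // External

// Scenario 3: an unknown, non-MCP tool fails immediately with no wait.
var unknown = await ReadinessSketch.ResolveAsync(registry, "not_a_tool", TimeSpan.FromSeconds(5));
Console.WriteLine(unknown.Failure); // Structural

// Structural and External are named in the PR; None is added here for the sketch.
public enum FailureCategory { None, Structural, External }

// Hypothetical registry: maps a tool name to an executor delegate.
public sealed class ToolRegistry
{
    private readonly ConcurrentDictionary<string, Func<string, string>> _tools = new();

    public void Register(string name, Func<string, string> executor) => _tools[name] = executor;

    public Func<string, string>? TryResolve(string name) =>
        _tools.TryGetValue(name, out var executor) ? executor : null;
}

public static class ReadinessSketch
{
    // Assumption: only the two management tools named in the PR are listed (it mentions six).
    private static readonly HashSet<string> ManagementTools =
        new(StringComparer.Ordinal) { "mcp_invoke_tool", "mcp_list_services" };

    private static readonly TimeSpan PollInterval = TimeSpan.FromMilliseconds(100);

    public static async Task<(Func<string, string>? Executor, FailureCategory Failure)> ResolveAsync(
        ToolRegistry registry, string toolName, TimeSpan readinessWait, CancellationToken ct = default)
    {
        // Steady state: a single lookup, no added latency.
        var executor = registry.TryResolve(toolName);
        if (executor is not null) return (executor, FailureCategory.None);

        // Unknown / non-MCP tools are not subject to the race, so fail at once.
        if (!ManagementTools.Contains(toolName)) return (null, FailureCategory.Structural);

        // Poll until the tool registers or the deadline passes.
        // With TimeSpan.Zero the loop body never runs: resolve once, then reclassify.
        var clock = Stopwatch.StartNew();
        while (clock.Elapsed < readinessWait)
        {
            await Task.Delay(PollInterval, ct);
            executor = registry.TryResolve(toolName);
            if (executor is not null) return (executor, FailureCategory.None);
        }

        // Still absent after the wait: transient and retryable.
        return (null, FailureCategory.External);
    }
}
```

Polling is arguably the simplest option when the registry exposes no "tool registered" notification; if it did, awaiting that signal with a timeout would avoid the up-to-100 ms of extra latency per poll.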
🤖 Generated with Claude Code