Fix HTTP 500 on human-in-the-loop pause for non-streaming FastAPI endpoints#1999
Conversation
…tive mode is enabled Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
WalkthroughThis PR adds interactive human-in-the-loop (HITL) mode support to FastAPI routes by introducing a conditional response model helper that optionally includes ExecutionAcceptedResponse (HTTP 202) when interactive mode is enabled, applies this pattern to generate and chat completions endpoints, and validates the behavior with integration tests. ChangesInteractive HITL Mode Support
🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…ugfix/hitl-v1workflow-response-validation
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py (1)
184-189: ⚡ Quick winGate the documented 202 response on
enable_interactive.The
response_modelis correctly conditioned via_interactive_response_model(..., enable_interactive), but theresponsesmap unconditionally documents202: ExecutionAcceptedResponse. For non-interactive SINGLE routes (e.g. the legacy path registered withenable_interactive=Falseat Lines 303-309), the handler never returns 202, so the OpenAPI schema advertises a response that can't occur, which misleads generated clients.♻️ Proposed fix to keep OpenAPI in sync with handler behavior
- response_model=_interactive_response_model(response_type, enable_interactive), - responses={ - 500: RESPONSE_500, 202: { - "model": ExecutionAcceptedResponse - } - }, + response_model=_interactive_response_model(response_type, enable_interactive), + responses={ + 500: RESPONSE_500, + **({ + 202: { + "model": ExecutionAcceptedResponse + } + } if enable_interactive else {}), + },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py` around lines 184 - 189, The OpenAPI responses map unconditionally includes 202: ExecutionAcceptedResponse while the route's response_model is already gated by _interactive_response_model(..., enable_interactive); update the route registration to build the responses dict dynamically: start with responses = {500: RESPONSE_500} and only add responses[202] = {"model": ExecutionAcceptedResponse} when enable_interactive is True, then pass that responses variable into the route decorator so the documented 202 is present only when enable_interactive is enabled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py`:
- Around line 184-189: The OpenAPI responses map unconditionally includes 202:
ExecutionAcceptedResponse while the route's response_model is already gated by
_interactive_response_model(..., enable_interactive); update the route
registration to build the responses dict dynamically: start with responses =
{500: RESPONSE_500} and only add responses[202] = {"model":
ExecutionAcceptedResponse} when enable_interactive is True, then pass that
responses variable into the route decorator so the documented 202 is present
only when enable_interactive is enabled.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 2e882302-2b4c-4a9d-88ea-c5bdf915ccc8
📒 Files selected for processing (4)
packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/common_utils.pypackages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.pypackages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/v1_chat_completions.pypackages/nvidia_nat_core/tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py
|
/merge |
Description
Summary
Fixes FastAPI returning HTTP 500 instead of HTTP 202 when a workflow pauses for human input (HITL) on the
/v1/workflowand/v1/chat/completionsnon-streaming endpoints. Whenenable_interactive=True, the routeresponse_modeldid not includeExecutionAcceptedResponse, causing FastAPI to raise aResponseValidationErroron any HITL pause.Changes
_interactive_response_modelhelper (nvidia_nat_core)Added
_interactive_response_modeltocommon_utils.py. Whenenable_interactive=True, expands the routeresponse_modelto a union that includesExecutionAcceptedResponse, allowing FastAPI to validate and serialize the HTTP 202 HITL response correctly./v1/workflowendpoint (generate.py)Updated
add_generate_routeto call_interactive_response_modelfor theSINGLEendpoint type and added202: {"model": ExecutionAcceptedResponse}to the OpenAPIresponsesdict./v1/chat/completionsendpoint (v1_chat_completions.py)Updated
add_v1_chat_completions_routeto call_interactive_response_modelinstead of a hardcoded union, keeping the pattern consistent with the generate route.Tests
Added two mock-based regression tests to
test_fastapi_front_end_plugin.py:test_workflow_single_hitl_pause_returns_202— asserts/v1/workflowreturns HTTP 202 and a fully-populatedExecutionAcceptedInteractionbody (execution ID, interaction ID, prompt, status URL, response URL) when a workflow pauses for human input.test_v1_chat_completions_hitl_pause_returns_202— same contract for/v1/chat/completions.By Submitting this PR I confirm: