Fix HTTP 500 on human-in-the-loop pause for non-streaming FastAPI endpoints #1999

Merged
rapids-bot[bot] merged 2 commits into NVIDIA:develop from ericevans-nv:bugfix/hitl-v1workflow-response-validation
Jun 1, 2026

Conversation

@ericevans-nv
Contributor

@ericevans-nv ericevans-nv commented Jun 1, 2026

Description

Summary
Fixes FastAPI returning HTTP 500 instead of HTTP 202 when a workflow pauses for human input (HITL) on the /v1/workflow and /v1/chat/completions non-streaming endpoints. When enable_interactive=True, the route response_model did not include ExecutionAcceptedResponse, causing FastAPI to raise a ResponseValidationError on any HITL pause.
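The failure mode can be illustrated with a toy model of response validation. This is a minimal standard-library sketch, not FastAPI's internal logic: `validate_response`, `ToyResponseValidationError`, and `WorkflowResult` are hypothetical names, and the `ExecutionAcceptedResponse` dataclass only borrows the name of the real model.

```python
from dataclasses import dataclass
from typing import Any, Union, get_args


@dataclass
class WorkflowResult:
    """Hypothetical stand-in for a route's normal response body."""
    value: str


@dataclass
class ExecutionAcceptedResponse:
    """Stand-in for the HTTP 202 body; the field is an assumption."""
    execution_id: str


class ToyResponseValidationError(Exception):
    """Local stand-in for a response validation failure."""


def validate_response(response_model: Any, payload: Any) -> Any:
    """Reject any payload whose type the declared model does not admit."""
    # get_args() returns the union members, or () for a plain class.
    allowed = get_args(response_model) or (response_model,)
    if not isinstance(payload, allowed):
        raise ToyResponseValidationError(
            f"{type(payload).__name__} is not allowed by the declared response model")
    return payload


pause = ExecutionAcceptedResponse(execution_id="hypothetical-id")

# Narrow model: the pause body is rejected, which surfaces as a server error.
try:
    validate_response(WorkflowResult, pause)
    narrow_model_accepts_pause = True
except ToyResponseValidationError:
    narrow_model_accepts_pause = False

# Widened model: the same pause body passes validation.
wide_model = Union[WorkflowResult, ExecutionAcceptedResponse]
wide_model_accepts_pause = validate_response(wide_model, pause) is pause
```

In this sketch the only difference between the failing and passing case is the declared model, which mirrors the shape of the fix described in this PR.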

Changes

  1. _interactive_response_model helper (nvidia_nat_core)
    Added _interactive_response_model to common_utils.py. When enable_interactive=True, it widens the route response_model to a union that includes ExecutionAcceptedResponse, so FastAPI can validate and serialize the HTTP 202 HITL response correctly.

  2. /v1/workflow endpoint (generate.py)
    Updated add_generate_route to call _interactive_response_model for the SINGLE endpoint type and added 202: {"model": ExecutionAcceptedResponse} to the OpenAPI responses dict.

  3. /v1/chat/completions endpoint (v1_chat_completions.py)
    Updated add_v1_chat_completions_route to call _interactive_response_model instead of a hardcoded union, keeping the pattern consistent with the generate route.
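The helper described in the changes above might look roughly like the following. This is a sketch based only on the behavior stated in this PR (return the original type when interactive mode is off, a union when it is on); the actual body in common_utils.py may differ, and the two dataclasses are hypothetical stand-ins so the snippet runs without the project installed.

```python
from dataclasses import dataclass
from typing import Any, Union, get_args


@dataclass
class ExecutionAcceptedResponse:
    """Stand-in for the real HTTP 202 model; fields are assumptions."""
    execution_id: str
    status_url: str


@dataclass
class WorkflowResult:
    """Hypothetical stand-in for a route's normal response type."""
    value: str


def interactive_response_model(response_type: Any, enable_interactive: bool) -> Any:
    """Return response_type unchanged, or widened to admit the 202 body."""
    if not enable_interactive:
        return response_type
    return Union[response_type, ExecutionAcceptedResponse]


plain = interactive_response_model(WorkflowResult, enable_interactive=False)
widened = interactive_response_model(WorkflowResult, enable_interactive=True)
```

Keeping the widening in one function means both routes compute their response model the same way, which is the consistency point made for the chat completions route.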

Tests
Added two mock-based regression tests to test_fastapi_front_end_plugin.py:

  • test_workflow_single_hitl_pause_returns_202 — asserts /v1/workflow returns HTTP 202 and a fully-populated ExecutionAcceptedInteraction body (execution ID, interaction ID, prompt, status URL, response URL) when a workflow pauses for human input.
  • test_v1_chat_completions_hitl_pause_returns_202 — same contract for /v1/chat/completions.
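The contract these tests assert can be sketched as a small checker. The PR names the concepts in the body (execution ID, interaction ID, prompt, status URL, response URL) but not the JSON keys, so every key in `REQUIRED_FIELDS` is an assumption, and `check_hitl_pause` is a hypothetical helper rather than code from the test file.

```python
from typing import Any, Mapping

# Assumed key names; the real ExecutionAcceptedInteraction schema may differ.
REQUIRED_FIELDS = (
    "execution_id",
    "interaction_id",
    "prompt",
    "status_url",
    "response_url",
)


def check_hitl_pause(status_code: int, body: Mapping[str, Any]) -> list[str]:
    """Return a list of contract violations; an empty list means the check passed."""
    problems: list[str] = []
    if status_code != 202:
        problems.append(f"expected HTTP 202, got {status_code}")
    # A field counts as populated only if it is present and truthy.
    problems.extend(
        f"missing or empty field: {name}"
        for name in REQUIRED_FIELDS
        if not body.get(name)
    )
    return problems
```

A regression test for either endpoint could call this on the response status and parsed JSON body and assert that the returned list is empty.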

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

…tive mode is enabled

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
@ericevans-nv ericevans-nv requested a review from a team as a code owner June 1, 2026 18:56
@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Walkthrough

This PR adds interactive human-in-the-loop (HITL) mode support to FastAPI routes by introducing a conditional response model helper that optionally includes ExecutionAcceptedResponse (HTTP 202) when interactive mode is enabled, applies this pattern to generate and chat completions endpoints, and validates the behavior with integration tests.

Changes

Interactive HITL Mode Support

| Layer | File | Summary |
| --- | --- | --- |
| Interactive response model helper | packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/common_utils.py | Adds ExecutionAcceptedResponse import and defines _interactive_response_model(response_type, enable_interactive) function that returns the original response type when interactive mode is disabled, or a union type including ExecutionAcceptedResponse when enabled. |
| Generate endpoint interactive mode | packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py | Imports interactive response modeling utilities and updates the SINGLE endpoint route registration to use _interactive_response_model for conditional response schema generation and documents HTTP 202 with ExecutionAcceptedResponse in OpenAPI metadata. |
| Chat completions endpoint interactive mode | packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/v1_chat_completions.py | Imports interactive response typing helpers and updates the v1 chat completions route to compute response model via _interactive_response_model, extending OpenAPI responses metadata to include HTTP 202 with ExecutionAcceptedResponse. |
| HITL behavior validation tests | packages/nvidia_nat_core/tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py | Adds test infrastructure for mocking and in-process HTTP client testing, imports HITL-specific models and route builders, and validates that both /v1/workflow and /v1/chat/completions endpoints return HTTP 202 with ExecutionAcceptedInteraction bodies when interactive mode is enabled and an interaction is required. |

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: enabling ExecutionAcceptedResponse for interactive endpoints. However, the title provided (75 chars) slightly exceeds the recommended max of ~72 characters. |

@ericevans-nv ericevans-nv self-assigned this Jun 1, 2026
@ericevans-nv ericevans-nv added the non-breaking (Non-breaking change) and bug (Something isn't working) labels Jun 1, 2026
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py (1)

184-189: ⚡ Quick win

Gate the documented 202 response on enable_interactive.

The response_model is correctly conditioned via _interactive_response_model(..., enable_interactive), but the responses map unconditionally documents 202: ExecutionAcceptedResponse. For non-interactive SINGLE routes (e.g. the legacy path registered with enable_interactive=False at Lines 303-309), the handler never returns 202, so the OpenAPI schema advertises a response that can't occur, which misleads generated clients.

♻️ Proposed fix to keep OpenAPI in sync with handler behavior
-                response_model=_interactive_response_model(response_type, enable_interactive),
-                responses={
-                    500: RESPONSE_500, 202: {
-                        "model": ExecutionAcceptedResponse
-                    }
-                },
+                response_model=_interactive_response_model(response_type, enable_interactive),
+                responses={
+                    500: RESPONSE_500,
+                    **({
+                        202: {
+                            "model": ExecutionAcceptedResponse
+                        }
+                    } if enable_interactive else {}),
+                },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py`
around lines 184 - 189, The OpenAPI responses map unconditionally includes 202:
ExecutionAcceptedResponse while the route's response_model is already gated by
_interactive_response_model(..., enable_interactive); update the route
registration to build the responses dict dynamically: start with responses =
{500: RESPONSE_500} and only add responses[202] = {"model":
ExecutionAcceptedResponse} when enable_interactive is True, then pass that
responses variable into the route decorator so the documented 202 is present
only when enable_interactive is enabled.
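The dictionary-building approach described in the prompt above could be sketched as follows. `build_responses` is a hypothetical helper name, and `RESPONSE_500` and `ExecutionAcceptedResponse` are placeholder stand-ins for the project's real objects so the snippet runs on its own.

```python
from typing import Any

# Placeholder for the project's shared 500 response description (assumed shape).
RESPONSE_500: dict[str, Any] = {"description": "Internal Server Error"}


class ExecutionAcceptedResponse:
    """Stand-in for the real HTTP 202 response model."""


def build_responses(enable_interactive: bool) -> dict[int, dict[str, Any]]:
    """Document HTTP 202 only on routes that can actually return it."""
    responses: dict[int, dict[str, Any]] = {500: RESPONSE_500}
    if enable_interactive:
        responses[202] = {"model": ExecutionAcceptedResponse}
    return responses
```

The result would be passed as the `responses` argument at route registration. Compared with the inline `**({...} if enable_interactive else {})` form in the proposed diff, the explicit `if` is longer but arguably easier to read and extend.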

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between e4c5f08 and 77a613d.

📒 Files selected for processing (4)
  • packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/common_utils.py
  • packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py
  • packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/v1_chat_completions.py
  • packages/nvidia_nat_core/tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py

@ericevans-nv ericevans-nv changed the title from "fix: include ExecutionAcceptedResponse in response_model when interactive mode is enabled" to "fix: non-streaming endpoints return HTTP 500 instead of 202 on HITL pause" Jun 1, 2026
@ericevans-nv ericevans-nv changed the title from "fix: non-streaming endpoints return HTTP 500 instead of 202 on HITL pause" to "Fix HTTP 500 on human-in-the-loop pause for non-streaming FastAPI endpoints" Jun 1, 2026
@ericevans-nv
Contributor Author

/merge

@rapids-bot rapids-bot Bot merged commit 96e0dda into NVIDIA:develop Jun 1, 2026
17 checks passed
Labels

bug (Something isn't working), non-breaking (Non-breaking change)

2 participants