Fix HTTP 500 on human-in-the-loop pause for non-streaming FastAPI endpoints by ericevans-nv · Pull Request #1999 · NVIDIA/NeMo-Agent-Toolkit

ericevans-nv · 2026-06-01T18:56:20Z

Description

Summary
Fixes FastAPI returning HTTP 500 instead of HTTP 202 when a workflow pauses for human input (HITL) on the /v1/workflow and /v1/chat/completions non-streaming endpoints. When enable_interactive=True, the route response_model did not include ExecutionAcceptedResponse, causing FastAPI to raise a ResponseValidationError on any HITL pause.

Changes

_interactive_response_model helper (nvidia_nat_core)
Added _interactive_response_model to common_utils.py. When enable_interactive=True, expands the route response_model to a union that includes ExecutionAcceptedResponse, allowing FastAPI to validate and serialize the HTTP 202 HITL response correctly.
/v1/workflow endpoint (generate.py)
Updated add_generate_route to call _interactive_response_model for the SINGLE endpoint type and added 202: {"model": ExecutionAcceptedResponse} to the OpenAPI responses dict.
/v1/chat/completions endpoint (v1_chat_completions.py)
Updated add_v1_chat_completions_route to call _interactive_response_model instead of a hardcoded union, keeping the pattern consistent with the generate route.

Tests
Added two mock-based regression tests to test_fastapi_front_end_plugin.py:

test_workflow_single_hitl_pause_returns_202 — asserts /v1/workflow returns HTTP 202 and a fully-populated ExecutionAcceptedInteraction body (execution ID, interaction ID, prompt, status URL, response URL) when a workflow pauses for human input.
test_v1_chat_completions_hitl_pause_returns_202 — same contract for /v1/chat/completions.

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.


coderabbitai · 2026-06-01T18:56:37Z

Walkthrough

This PR adds interactive human-in-the-loop (HITL) mode support to the FastAPI routes. It introduces a conditional response model helper that includes ExecutionAcceptedResponse (HTTP 202) when interactive mode is enabled, applies this pattern to the generate and chat completions endpoints, and validates the behavior with integration tests.

Changes

Interactive HITL Mode Support

| Layer / File(s) | Summary |
| --- | --- |
| Interactive response model helper `packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/common_utils.py` | Adds `ExecutionAcceptedResponse` import and defines `_interactive_response_model(response_type, enable_interactive)` function that returns the original response type when interactive mode is disabled, or a union type including `ExecutionAcceptedResponse` when enabled. |
| Generate endpoint interactive mode `packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py` | Imports interactive response modeling utilities and updates the SINGLE endpoint route registration to use `_interactive_response_model` for conditional response schema generation and documents HTTP 202 with `ExecutionAcceptedResponse` in OpenAPI metadata. |
| Chat completions endpoint interactive mode `packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/v1_chat_completions.py` | Imports interactive response typing helpers and updates the v1 chat completions route to compute response model via `_interactive_response_model`, extending OpenAPI responses metadata to include HTTP 202 with `ExecutionAcceptedResponse`. |
| HITL behavior validation tests `packages/nvidia_nat_core/tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py` | Adds test infrastructure for mocking and in-process HTTP client testing, imports HITL-specific models and route builders, and validates that both `/v1/workflow` and `/v1/chat/completions` endpoints return HTTP 202 with `ExecutionAcceptedInteraction` bodies when interactive mode is enabled and an interaction is required. |

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: enabling ExecutionAcceptedResponse for interactive endpoints. However, the title provided (75 chars) slightly exceeds the recommended max of ~72 characters. |


coderabbitai

🧹 Nitpick comments (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py (1)

184-189: ⚡ Quick win

Gate the documented 202 response on enable_interactive.

The response_model is correctly conditioned via _interactive_response_model(..., enable_interactive), but the responses map unconditionally documents 202: ExecutionAcceptedResponse. For non-interactive SINGLE routes (e.g. the legacy path registered with enable_interactive=False at Lines 303-309), the handler never returns 202, so the OpenAPI schema advertises a response that can't occur, which misleads generated clients.

♻️ Proposed fix to keep OpenAPI in sync with handler behavior

-                response_model=_interactive_response_model(response_type, enable_interactive),
-                responses={
-                    500: RESPONSE_500, 202: {
-                        "model": ExecutionAcceptedResponse
-                    }
-                },
+                response_model=_interactive_response_model(response_type, enable_interactive),
+                responses={
+                    500: RESPONSE_500,
+                    **({
+                        202: {
+                            "model": ExecutionAcceptedResponse
+                        }
+                    } if enable_interactive else {}),
+                },

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py`
around lines 184 - 189, The OpenAPI responses map unconditionally includes 202:
ExecutionAcceptedResponse while the route's response_model is already gated by
_interactive_response_model(..., enable_interactive); update the route
registration to build the responses dict dynamically: start with responses =
{500: RESPONSE_500} and only add responses[202] = {"model":
ExecutionAcceptedResponse} when enable_interactive is True, then pass that
responses variable into the route decorator so the documented 202 is present
only when enable_interactive is enabled.
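
The dict-building approach described in the prompt above can be sketched as follows. This is a minimal sketch under stated assumptions: `build_responses` is a hypothetical helper name, and the `RESPONSE_500` value and the empty `ExecutionAcceptedResponse` class are placeholders for the real objects in `generate.py`.

```python
from typing import Any, Dict

# Placeholder for the real RESPONSE_500 constant; its contents are assumed.
RESPONSE_500: Dict[str, Any] = {"description": "Internal Server Error"}


class ExecutionAcceptedResponse:
    """Placeholder for the real HTTP 202 response model."""


def build_responses(enable_interactive: bool) -> Dict[int, Dict[str, Any]]:
    # Hypothetical helper: every route documents the 500 response.
    responses: Dict[int, Dict[str, Any]] = {500: RESPONSE_500}
    # Only interactive routes can pause for human input, so only they
    # should advertise the 202 response in the OpenAPI schema.
    if enable_interactive:
        responses[202] = {"model": ExecutionAcceptedResponse}
    return responses


print(sorted(build_responses(enable_interactive=True)))
print(sorted(build_responses(enable_interactive=False)))
```

The resulting dict would then be passed as the `responses` argument at route registration. Compared with the inline `**({...} if enable_interactive else {})` form in the proposed diff, this trades a few extra lines for a condition that is easier to read.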


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2e882302-2b4c-4a9d-88ea-c5bdf915ccc8

📥 Commits

Reviewing files that changed from the base of the PR and between e4c5f08 and 77a613d.

📒 Files selected for processing (4)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/common_utils.py
packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/generate.py
packages/nvidia_nat_core/src/nat/front_ends/fastapi/routes/v1_chat_completions.py
packages/nvidia_nat_core/tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py

ericevans-nv · 2026-06-01T19:54:38Z

/merge

fix: include ExecutionAcceptedResponse in response_model when interac…

77a613d

…tive mode is enabled Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

ericevans-nv requested a review from a team as a code owner June 1, 2026 18:56

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into b…

24bfe12

…ugfix/hitl-v1workflow-response-validation

ericevans-nv self-assigned this Jun 1, 2026

ericevans-nv added the "non-breaking" (Non-breaking change) and "bug" (Something isn't working) labels Jun 1, 2026

coderabbitai Bot reviewed Jun 1, 2026


ericevans-nv changed the title ~~fix: include ExecutionAcceptedResponse in response_model when interactive mode is enabled~~ fix: non-streaming endpoints return HTTP 500 instead of 202 on HITL pause Jun 1, 2026

ericevans-nv changed the title ~~fix: non-streaming endpoints return HTTP 500 instead of 202 on HITL pause~~ Fix HTTP 500 on human-in-the-loop pause for non-streaming FastAPI endpoints Jun 1, 2026

willkill07 approved these changes Jun 1, 2026


rapids-bot Bot merged commit 96e0dda into NVIDIA:develop Jun 1, 2026
17 checks passed

Fix HTTP 500 on human-in-the-loop pause for non-streaming FastAPI endpoints #1999
rapids-bot[bot] merged 2 commits into NVIDIA:develop from ericevans-nv:bugfix/hitl-v1workflow-response-validation
