Skip to content

Prevent REST response overflow and double-close failures#22356

Open
venkateshwaracholan wants to merge 2 commits into
opensearch-project:mainfrom
venkateshwaracholan:fix/bytes-rest-response-overflow
Open

Prevent REST response overflow and double-close failures#22356
venkateshwaracholan wants to merge 2 commits into
opensearch-project:mainfrom
venkateshwaracholan:fix/bytes-rest-response-overflow

Conversation

@venkateshwaracholan

Copy link
Copy Markdown

Summary

Fix two related bugs in the REST response path that can cause a silent ~5-minute hang (zero bytes flushed, LB 504) when search response bodies exceed ~715 MB.

  • Bug A: Prevent integer overflow during UTF-8 encoding of oversized response strings.
  • Bug B: Make ResourceHandlingHttpChannel and StreamHandlingHttpChannel close() idempotent so failure responses can still be sent after a partial sendResponse() failure.

Fixes #22311.


Problem

Large string responses are encoded through:

BytesRestResponse
  → BytesArray
    → BytesRef
      → UnicodeUtil.maxUTF8Length()

When the UTF-16 string length exceeds Integer.MAX_VALUE / 3, Lucene throws an ArithmeticException.

Because the response channel has already been partially closed, RestActionListener.onFailure() attempts another sendResponse(), which fails with IllegalStateException("Channel is already closed"), leaving the connection hanging until the load balancer times out.


Solution

Bug A

Validate the UTF-16 length before constructing BytesRef and return a typed OpenSearchStatusException (HTTP 413) instead of exposing an uncaught ArithmeticException.

The issue discussion mentioned either 413 or 507. This PR currently uses 413 and can be updated if maintainers prefer 507.

Bug B

Make channel close() idempotent so repeated calls become a no-op instead of throwing, allowing the failure response to be sent correctly.


Evidence / Tests

Test Verifies
BytesArrayTests.testOverflowGuardPreventsBytesRefIntegerOverflow Prevents integer overflow without allocating a ~715 MB string
BytesRestResponseTests.testToBytesArrayMapsOverflowGuardFailureToRequestEntityTooLarge Maps overflow to OpenSearchStatusException (HTTP 413)
RestControllerTests.testResourceHandlingChannelCloseIsIdempotent Multiple sendResponse() calls no longer fail due to repeated close()

Validation

./gradlew :libs:opensearch-core:test --tests "org.opensearch.core.common.bytes.BytesArrayTests"

./gradlew :server:test --tests "org.opensearch.rest.BytesRestResponseTests.testToBytesArrayMapsOverflowGuardFailureToRequestEntityTooLarge"

./gradlew :server:test --tests "org.opensearch.rest.RestControllerTests.testResourceHandlingChannelCloseIsIdempotent"

All tests pass locally.


Related


Checklist

  • Functionality includes tests.
  • API changes companion PR not required.
  • Documentation changes not required.

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Venkateshwaran Shanmugham <venkateshwaran@cloudera.com>
Signed-off-by: Venkateshwaran Shanmugham <venkateshwaracholan@gmail.com>
@github-actions

github-actions Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

(Review updated until commit 354a74e)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Bug A: Prevent integer overflow during UTF-8 encoding of oversized response strings

Relevant files:

  • libs/core/src/main/java/org/opensearch/core/common/bytes/BytesArray.java
  • libs/core/src/test/java/org/opensearch/core/common/bytes/BytesArrayTests.java
  • server/src/main/java/org/opensearch/rest/BytesRestResponse.java
  • server/src/test/java/org/opensearch/rest/BytesRestResponseTests.java

Sub-PR theme: Bug B: Make channel close() idempotent to allow failure responses after partial sendResponse()

Relevant files:

  • server/src/main/java/org/opensearch/rest/RestController.java
  • server/src/test/java/org/opensearch/rest/RestControllerTests.java

⚡ Recommended focus areas for review

Test Validity

The testResourceHandlingChannelCloseIsIdempotent test uses MultiResponseChannel which directly extends AbstractRestChannel and bypasses ResourceHandlingHttpChannel entirely. The test never exercises the circuit-breaker wrapping logic that was actually changed. As a result, the test passes regardless of whether the idempotency fix is present, and does not reproduce Bug B as claimed.

public void testResourceHandlingChannelCloseIsIdempotent() {
    restController.registerHandler(RestRequest.Method.GET, "/repro-bug-b", (request, channel, client) -> {
        channel.sendResponse(new BytesRestResponse(RestStatus.OK, BytesRestResponse.TEXT_CONTENT_TYPE, BytesArray.EMPTY));
        channel.sendResponse(
            new BytesRestResponse(RestStatus.INTERNAL_SERVER_ERROR, BytesRestResponse.TEXT_CONTENT_TYPE, BytesArray.EMPTY)
        );
    });
    int contentLength = BREAKER_LIMIT.bytesAsInt();
    String content = randomAlphaOfLength((int) Math.round(contentLength / inFlightRequestsBreaker.getOverhead()));
    RestRequest request = testRestRequest("/repro-bug-b", content, MediaTypeRegistry.JSON);
    MultiResponseChannel channel = new MultiResponseChannel(request, true);

    restController.dispatchRequest(request, channel, client.threadPool().getThreadContext());

    assertEquals(2, channel.getResponseCount());
    assertEquals(RestStatus.INTERNAL_SERVER_ERROR, channel.getLastStatus());
    assertEquals(0, inFlightRequestsBreaker.getUsed());
}
Exception Swallowing

toBytesArray catches all IllegalArgumentException and maps them to HTTP 413. If BytesArray(String) throws an IllegalArgumentException for a reason unrelated to the overflow guard (e.g., a future change in BytesArray), the caller will silently receive a misleading 413 response instead of a proper error. The catch should be narrowed to only handle the specific overflow case, for example by checking the exception message or using a dedicated exception type.

static BytesArray toBytesArray(String content) {
    try {
        return new BytesArray(content);
    } catch (IllegalArgumentException e) {
        throw overflowGuardFailureToRequestEntityTooLarge(e);
    }
}

@github-actions

github-actions Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Latest suggestions up to 354a74e
Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent forwarding duplicate responses to delegate

Making close() idempotent allows sendResponse to be called multiple times, which
will forward multiple responses to delegate.sendResponse. Most HTTP transports do
not tolerate writing two responses on the same channel and may throw or corrupt the
wire protocol. Consider guarding delegate.sendResponse so it is only invoked on the
first call, while still keeping the breaker release idempotent.

server/src/main/java/org/opensearch/rest/RestController.java [659-668]

 public void sendResponse(RestResponse response) {
-    close();
-    delegate.sendResponse(response);
+    if (closed.compareAndSet(false, true)) {
+        inFlightRequestsBreaker(circuitBreakerService).addWithoutBreaking(-contentLength);
+        delegate.sendResponse(response);
+    }
 }
Suggestion importance[1-10]: 7

__

Why: Valid concern: making close() idempotent while still allowing multiple delegate.sendResponse calls could cause HTTP protocol issues. However, the PR's test explicitly expects two responses to be sent, suggesting this is intentional behavior, so the suggestion may conflict with PR intent.

Medium
General
Also reject negative length values

Negative utf16Length values pass the overflow check (since (long) negative * 3 is
not > Integer.MAX_VALUE) but would also be invalid input. Consider also rejecting
negative lengths to provide clearer error semantics and avoid passing nonsensical
values downstream.

libs/core/src/main/java/org/opensearch/core/common/bytes/BytesArray.java [75-85]

 public static void ensureUTF16LengthIsValidForUTF8Encoding(int utf16Length) {
-    if ((long) utf16Length * MAX_UTF8_BYTES_PER_CHAR > Integer.MAX_VALUE) {
+    if (utf16Length < 0 || (long) utf16Length * MAX_UTF8_BYTES_PER_CHAR > Integer.MAX_VALUE) {
         throw new IllegalArgumentException(
             "UTF16 string length ["
                 + utf16Length
                 + "] exceeds maximum ["
                 + MAX_UTF16_LENGTH_FOR_UTF8
                 + "] that can be UTF-8 encoded without integer overflow"
         );
     }
 }
Suggestion importance[1-10]: 3

__

Why: Minor input validation improvement; negative lengths are unlikely in practice since they come from String.length(), but adding the check provides clearer error semantics.

Low
Verify constructor delegation parameter order

The constructor BytesRestResponse(RestStatus, String) now delegates to
BytesRestResponse(RestStatus, String, String) via this(status, TEXT_CONTENT_TYPE,
content), but the second parameter of the three-arg constructor is contentType, not
the existing String content. Verify the constructor signatures: if the three-arg
constructor is (RestStatus, String contentType, String content), this call is
correct, but ensure no overload resolution ambiguity exists with (RestStatus,
String, BytesReference).

server/src/main/java/org/opensearch/rest/BytesRestResponse.java [81-83]

-public BytesRestResponse(RestStatus status, XContentBuilder builder) {
-    this(status, builder.contentType().mediaType(), BytesReference.bytes(builder));
-}
-
-/**
- * Creates a new plain text response.
- */
 public BytesRestResponse(RestStatus status, String content) {
     this(status, TEXT_CONTENT_TYPE, content);
 }
Suggestion importance[1-10]: 2

__

Why: This is merely a verification request without identifying a concrete issue; the existing constructor signatures appear correct.

Low

Previous suggestions

Suggestions up to commit dd9846a
CategorySuggestion                                                                                                                                    Impact
Possible issue
Avoid forwarding duplicate responses to delegate

Making close() idempotent silently allows multiple sendResponse calls to be
forwarded to the delegate, which can violate the HTTP channel contract (delegates
typically throw on double-send) and produce malformed responses. Guard sendResponse
itself so only the first response is forwarded to the delegate while subsequent
calls are dropped (or logged), preserving idempotency of the breaker accounting
without forwarding duplicate responses downstream.

server/src/main/java/org/opensearch/rest/RestController.java [664-667]

 public void sendResponse(RestResponse response) {
-    close();
-    delegate.sendResponse(response);
+    if (closed.compareAndSet(false, true)) {
+        inFlightRequestsBreaker(circuitBreakerService).addWithoutBreaking(-contentLength);
+        delegate.sendResponse(response);
+    }
 }
Suggestion importance[1-10]: 7

__

Why: Valid concern: making close() idempotent allows duplicate sendResponse calls to reach the delegate, which could cause downstream errors or malformed responses. However, the PR's test explicitly verifies that two sendResponse calls succeed, so this may conflict with the intended behavior.

Medium
Align overflow guard with Lucene constant

Lucene's UnicodeUtil#MAX_UTF8_BYTES_PER_CHAR is actually 4 (UTF-8 can require up to
4 bytes per code unit pair, and Lucene's maxUTF8Length uses 4). Using 3 here
under-estimates the worst case and will still allow strings that overflow
maxUTF8Length when computing utf16Length * 4. Use 4 to align with Lucene and truly
prevent the overflow.

libs/core/src/main/java/org/opensearch/core/common/bytes/BytesArray.java [54-55]

-public static void ensureUTF16LengthIsValidForUTF8Encoding(int utf16Length) {
-    if ((long) utf16Length * MAX_UTF8_BYTES_PER_CHAR > Integer.MAX_VALUE) {
+private static final int MAX_UTF8_BYTES_PER_CHAR = 4;
+private static final int MAX_UTF16_LENGTH_FOR_UTF8 = Integer.MAX_VALUE / MAX_UTF8_BYTES_PER_CHAR;
Suggestion importance[1-10]: 6

__

Why: Lucene's UnicodeUtil.MAX_UTF8_BYTES_PER_CHAR is actually 3 (since Java chars are UTF-16 code units and supplementary chars use surrogate pairs), so the suggestion's premise is questionable. However, Lucene's maxUTF8Length does use a factor that warrants verification against the actual constant.

Low

@github-actions

Copy link
Copy Markdown
Contributor

❌ Gradle check result for dd9846a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Venkateshwaran Shanmugham <venkateshwaracholan@gmail.com>
@venkateshwaracholan venkateshwaracholan force-pushed the fix/bytes-rest-response-overflow branch from e0b5118 to 354a74e Compare June 30, 2026 18:50
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 354a74e

@github-actions

Copy link
Copy Markdown
Contributor

✅ Gradle check result for 354a74e: SUCCESS

@codecov

codecov Bot commented Jun 30, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 76.47059% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.36%. Comparing base (319d78a) to head (354a74e).

Files with missing lines Patch % Lines
...in/java/org/opensearch/rest/BytesRestResponse.java 71.42% 2 Missing ⚠️
.../main/java/org/opensearch/rest/RestController.java 50.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #22356      +/-   ##
============================================
- Coverage     73.40%   73.36%   -0.05%     
+ Complexity    76075    76038      -37     
============================================
  Files          6076     6076              
  Lines        345500   345507       +7     
  Branches      49732    49732              
============================================
- Hits         253608   253474     -134     
- Misses        71707    71778      +71     
- Partials      20185    20255      +70     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] ArithmeticException integer overflow in BytesRestResponse causes silent 504 hang for responses >= ~715 MB

1 participant