Avoid retrying object_exists 404 responses#458
Conversation
📝 WalkthroughWalkthroughBucket::object_exists now calls Request::response_status() and returns false only for HTTP 404; Request gained a response_status implementation in tokio, async-std, and blocking backends that executes the request under existing retry logic but treats 404 as a non-retryable success path. A Tokio test verifies no retry on 404. ChangesStatus Extraction Optimization
Sequence Diagram(s)sequenceDiagram
participant Bucket as Bucket::object_exists
participant Req as Request::response_status
participant Retry as crate::retry!
participant Server as Remote S3 / TCP server
Bucket->>Req: call response_status(path)
Req->>Retry: execute HTTP request
Retry->>Server: TCP/HTTP request
Server-->>Retry: 404 Not Found
Retry-->>Req: returns 404 status
Req-->>Bucket: Ok(404)
Bucket-->>Bucket: returns false (status == 404)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 3/8 reviews remaining, refill in 36 minutes and 25 seconds.Comment |
Brownian Motion (Brass)Recommendation: Proceed Summary: PR effectively addresses unnecessary retries for 404 in object_exists. Highlights
Unknowns
Next actions
Reflection questions
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
s3/src/request/request_trait.rs (1)
213-216: ⚡ Quick winDocument the override contract for
response_status().This new public trait method currently relies on callers discovering from the backend impls that the default path just delegates to
response_header(). Please add rustdoc with an example and explicitly note that backends should override it when they need status-only classification before error/retry handling kicks in.As per coding guidelines, "All public APIs should have documentation examples".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@s3/src/request/request_trait.rs` around lines 213 - 216, Add rustdoc for the public trait method response_status() in request_trait.rs that documents its override contract: explain that the default implementation simply delegates to response_header(), show a short example snippet (illustrating overriding response_status() to return just the status code without calling response_header()), and explicitly state that backend implementations should override response_status() when they need status-only classification prior to error/retry handling; reference the related method response_header() in the doc and include a one-line code example per the public API docs guideline.s3/src/bucket.rs (1)
3123-3174: ⚡ Quick winAdd the same no-retry regression test for async-std and blocking.
The behavior change lives in three backend-specific
response_status()overrides, but only the Tokio path is runtime-tested here. Compile checks won't catch a backend-specific regression in the 404 classification/retry behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@s3/src/bucket.rs` around lines 3123 - 3174, The new Tokio-only test test_object_exists_404_does_not_retry only verifies the no-retry 404 behavior for the Tokio backend, so add equivalent tests for the async-std and blocking runtimes to prevent backend-specific regressions: create duplicate async tests decorated with #[cfg(all(not(feature = "sync"), feature = "with-async-std"))] and #[async_std::test] using the same logic as test_object_exists_404_does_not_retry, and a blocking test under #[cfg(feature = "sync")] (e.g., using std::thread::spawn and blocking Bucket::object_exists calls) so the three backend-specific response_status() overrides are covered; ensure each test uses the same request counter, TcpListener endpoint, crate::set_retries calls, and assertions (exists is false and requests == 1).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@s3/src/bucket.rs`:
- Around line 1028-1029: The current check treats any non-404 status as
existence because it returns Ok(status_code != 404); change this to return
Ok(true) only when the response status is a success (2xx) and return Ok(false)
for 404, but surface any other status as an error. Locate the call to
request.response_status().await? and replace the boolean expression with a
branch that uses status.is_success() (or checks 200..=299) to return Ok(true),
explicitly handle 404 to return Ok(false), and convert all other status codes
into an Err (propagating or mapping into the function’s error type) so
auth/server errors aren’t masked.
---
Nitpick comments:
In `@s3/src/bucket.rs`:
- Around line 3123-3174: The new Tokio-only test
test_object_exists_404_does_not_retry only verifies the no-retry 404 behavior
for the Tokio backend, so add equivalent tests for the async-std and blocking
runtimes to prevent backend-specific regressions: create duplicate async tests
decorated with #[cfg(all(not(feature = "sync"), feature = "with-async-std"))]
and #[async_std::test] using the same logic as
test_object_exists_404_does_not_retry, and a blocking test under #[cfg(feature =
"sync")] (e.g., using std::thread::spawn and blocking Bucket::object_exists
calls) so the three backend-specific response_status() overrides are covered;
ensure each test uses the same request counter, TcpListener endpoint,
crate::set_retries calls, and assertions (exists is false and requests == 1).
In `@s3/src/request/request_trait.rs`:
- Around line 213-216: Add rustdoc for the public trait method response_status()
in request_trait.rs that documents its override contract: explain that the
default implementation simply delegates to response_header(), show a short
example snippet (illustrating overriding response_status() to return just the
status code without calling response_header()), and explicitly state that
backend implementations should override response_status() when they need
status-only classification prior to error/retry handling; reference the related
method response_header() in the doc and include a one-line code example per the
public API docs guideline.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d7349aff-d15a-4e27-9643-ab32617a8679
📒 Files selected for processing (5)
s3/src/bucket.rss3/src/request/async_std_backend.rss3/src/request/blocking.rss3/src/request/request_trait.rss3/src/request/tokio_backend.rs
| let status_code = request.response_status().await?; | ||
| Ok(status_code != 404) |
There was a problem hiding this comment.
Only treat successful HEAD responses as existence.
response_status() returns raw non-2xx statuses when fail-on-err is disabled, so this now reports Ok(true) for cases like 403, 500, or a redirect. That hides auth/server problems as “object exists”. Please keep 404 => false, return true only for success statuses, and surface every other status as an error.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@s3/src/bucket.rs` around lines 1028 - 1029, The current check treats any
non-404 status as existence because it returns Ok(status_code != 404); change
this to return Ok(true) only when the response status is a success (2xx) and
return Ok(false) for 404, but surface any other status as an error. Locate the
call to request.response_status().await? and replace the boolean expression with
a branch that uses status.is_success() (or checks 200..=299) to return Ok(true),
explicitly handle 404 to return Ok(false), and convert all other status codes
into an Err (propagating or mapping into the function’s error type) so
auth/server errors aren’t masked.
8713dcb to
4b6cc05
Compare
Brownian Motion (Brass)Recommendation: Proceed Summary: PR effectively addresses unnecessary retries for 404 responses in object_exists. Highlights
Unknowns
Next actions
Reflection questions
|
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
s3/src/bucket.rs (1)
1028-1029:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNon-404, non-2xx statuses are silently reported as "object exists".
When
fail-on-erris disabled (the default),response_status()returnsOk(status)for statuses like 403, 500, or 301. The expressionstatus_code != 404then evaluates totrue, soobject_existsreturnsOk(true)for permission errors, server errors, and redirects — masking real problems as "object exists".The fix belongs in
object_exists: treat only 2xx as existence and surface everything else as an error:🐛 Proposed fix
- let status_code = request.response_status().await?; - Ok(status_code != 404) + match request.response_status().await? { + 404 => Ok(false), + s if (200..300).contains(&(s as u32)) => Ok(true), + s => Err(S3Error::HttpFailWithBody(s, String::new())), + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@s3/src/bucket.rs` around lines 1028 - 1029, The current object_exists implementation treats any non-404 response as existence because it returns Ok(status_code != 404); change object_exists to consider only 2xx responses as existence and treat all other statuses (including 3xx, 4xx != 404, 5xx) as errors: call response_status(), check status.is_success() (or status.as_u16() in range 200..=299) and return Ok(true) only for 2xx, return Ok(false) only for 404, and for any other status return Err(...) (propagate or map the non-2xx status into a descriptive error). Update logic around response_status() and the object_exists function to surface non-2xx statuses instead of reporting them as existence.
🧹 Nitpick comments (1)
s3/src/request/tokio_backend.rs (1)
120-168: 🏗️ Heavy lift
response_statusduplicates the entire request-building block fromresponse().The ~20-line sequence (header construction, client retrieval, method matching, URL/body assembly,
client.execute) is copied verbatim fromresponse()(lines 76–109). Any future change to request construction must be maintained in two places.
response_status()cannot simply delegate toself.response()becauseresponse()already applies thefail-on-errcheck and would convert a 404 intoErrbeforeresponse_statuscan intercept it. The clean fix is to extract a privateexecute_rawhelper that builds and fires the request without thefail-on-errguard, then have both methods call it:♻️ Suggested refactor
+ // Private: build and fire the request without any error-on-status logic. + async fn execute_raw(&self) -> Result<reqwest::Response, S3Error> { + let headers = self + .headers() + .await? + .iter() + .map(|(k, v)| { + ( + reqwest::header::HeaderName::from_str(k.as_str()), + reqwest::header::HeaderValue::from_str(v.to_str().unwrap_or_default()), + ) + }) + .filter(|(k, v)| k.is_ok() && v.is_ok()) + .map(|(k, v)| (k.unwrap(), v.unwrap())) + .collect(); + + let client = self.bucket.http_client(); + let method = match self.command.http_verb() { + HttpMethod::Delete => reqwest::Method::DELETE, + HttpMethod::Get => reqwest::Method::GET, + HttpMethod::Post => reqwest::Method::POST, + HttpMethod::Put => reqwest::Method::PUT, + HttpMethod::Head => reqwest::Method::HEAD, + }; + let request = client + .request(method, self.url()?.as_str()) + .headers(headers) + .body(self.request_body()?) + .build()?; + Ok(client.execute(request).await?) + } async fn response(&self) -> Result<Self::Response, S3Error> { - let headers = self - .headers() - // ... (full duplicated block) - let response = client.execute(request).await?; + let response = self.execute_raw().await?; if cfg!(feature = "fail-on-err") && !response.status().is_success() { let status = response.status().as_u16(); let text = response.text().await?; return Err(S3Error::HttpFailWithBody(status, text)); } Ok(response) } async fn response_status(&self) -> Result<u16, S3Error> { retry! { async { - let headers = self - .headers() - // ... (full duplicated block) - let response = client.execute(request).await?; - let status = response.status().as_u16(); + let response = self.execute_raw().await?; + let status = response.status().as_u16(); if status == 404 { return Ok(status); } if cfg!(feature = "fail-on-err") && !response.status().is_success() { let text = response.text().await?; return Err(S3Error::HttpFailWithBody(status, text)); } Ok(status) }.await } }Note:
execute_rawshould live in a separateimpl<'a> ReqwestRequest<'a>block (outside the#[maybe_async]-annotated trait impl) so it isn't subject to the trait's async/sync transformation. The same pattern should be applied to the analogous backends (async_std_backend.rs,blocking.rs).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@s3/src/request/tokio_backend.rs` around lines 120 - 168, response_status currently duplicates request construction from response; extract a private helper (e.g., execute_raw) on the ReqwestRequest type that builds and executes the reqwest Request and returns the raw Response (without applying the fail-on-err logic or converting 404s), place execute_raw in a separate impl<'a> ReqwestRequest<'a> block (outside the #[maybe_async] trait impl) so it won't be transformed, then refactor response() to call execute_raw and apply the existing fail-on-err handling there, and refactor response_status() to call execute_raw and inspect/return the status (handling 404 specially) — replicate this pattern for async_std_backend.rs and blocking.rs as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@s3/src/bucket.rs`:
- Around line 3123-3174: The test test_object_exists_404_does_not_retry modifies
global retry state via crate::set_retries(1) without restoring it, which can
leak into other tests; change the test to capture the current retry count before
setting (e.g., let prev = crate::get_retries() or otherwise read the current
value), call crate::set_retries(1) for the duration of the test, and then
restore the original value at the end (crate::set_retries(prev)) — ensure
restoration runs even if the test panics by using a guard/RAII pattern or a
finally-style drop guard inside test_object_exists_404_does_not_retry so the
global retries are returned to their original value.
---
Duplicate comments:
In `@s3/src/bucket.rs`:
- Around line 1028-1029: The current object_exists implementation treats any
non-404 response as existence because it returns Ok(status_code != 404); change
object_exists to consider only 2xx responses as existence and treat all other
statuses (including 3xx, 4xx != 404, 5xx) as errors: call response_status(),
check status.is_success() (or status.as_u16() in range 200..=299) and return
Ok(true) only for 2xx, return Ok(false) only for 404, and for any other status
return Err(...) (propagate or map the non-2xx status into a descriptive error).
Update logic around response_status() and the object_exists function to surface
non-2xx statuses instead of reporting them as existence.
---
Nitpick comments:
In `@s3/src/request/tokio_backend.rs`:
- Around line 120-168: response_status currently duplicates request construction
from response; extract a private helper (e.g., execute_raw) on the
ReqwestRequest type that builds and executes the reqwest Request and returns the
raw Response (without applying the fail-on-err logic or converting 404s), place
execute_raw in a separate impl<'a> ReqwestRequest<'a> block (outside the
#[maybe_async] trait impl) so it won't be transformed, then refactor response()
to call execute_raw and apply the existing fail-on-err handling there, and
refactor response_status() to call execute_raw and inspect/return the status
(handling 404 specially) — replicate this pattern for async_std_backend.rs and
blocking.rs as well.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b6be0e67-4f0b-4434-9b93-2cc4561c3608
📒 Files selected for processing (5)
s3/src/bucket.rss3/src/request/async_std_backend.rss3/src/request/blocking.rss3/src/request/request_trait.rss3/src/request/tokio_backend.rs
✅ Files skipped from review due to trivial changes (1)
- s3/src/request/blocking.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- s3/src/request/async_std_backend.rs
- s3/src/request/request_trait.rs
| #[cfg(all(not(feature = "sync"), feature = "with-tokio"))] | ||
| #[tokio::test] | ||
| async fn test_object_exists_404_does_not_retry() { | ||
| init(); | ||
|
|
||
| let listener = TcpListener::bind("127.0.0.1:0").unwrap(); | ||
| let endpoint = format!("http://{}", listener.local_addr().unwrap()); | ||
| let requests = Arc::new(AtomicUsize::new(0)); | ||
| let request_count = Arc::clone(&requests); | ||
|
|
||
| let server = thread::spawn(move || { | ||
| let (mut stream, _) = listener.accept().unwrap(); | ||
| request_count.fetch_add(1, Ordering::SeqCst); | ||
|
|
||
| let mut buffer = [0; 2048]; | ||
| let _ = stream.read(&mut buffer).unwrap(); | ||
| stream | ||
| .write_all( | ||
| b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\nConnection: close\r\n\r\n", | ||
| ) | ||
| .unwrap(); | ||
| }); | ||
|
|
||
| crate::set_retries(1); | ||
|
|
||
| let credentials = Credentials::new( | ||
| Some("test_access_key"), | ||
| Some("test_secret_key"), | ||
| None, | ||
| None, | ||
| None, | ||
| ) | ||
| .unwrap(); | ||
| let bucket = Bucket::new( | ||
| "test-bucket", | ||
| Region::Custom { | ||
| region: "us-east-1".to_owned(), | ||
| endpoint, | ||
| }, | ||
| credentials, | ||
| ) | ||
| .unwrap() | ||
| .with_path_style(); | ||
|
|
||
| let exists = bucket.object_exists("/missing.txt").await.unwrap(); | ||
|
|
||
| crate::set_retries(1); | ||
| server.join().unwrap(); | ||
|
|
||
| assert!(!exists); | ||
| assert_eq!(requests.load(Ordering::SeqCst), 1); | ||
| } |
There was a problem hiding this comment.
LGTM — test correctly enforces the no-retry invariant.
The single-accept server design is intentional and effective: if object_exists retried, the second attempt would receive ECONNREFUSED, causing .unwrap() at line 3167 to panic and failing the test. The requests counter then provides a precise assertion that only one request was made.
One note: crate::set_retries(1) at line 3169 is global state and does not restore the previous default — it just sets the same value that was already set at line 3146. If the library's default retry count is greater than 1, tests running after this one in the same process will unexpectedly inherit retries = 1. Consider saving and restoring the original value, or using a test-local mechanism if the API supports it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@s3/src/bucket.rs` around lines 3123 - 3174, The test
test_object_exists_404_does_not_retry modifies global retry state via
crate::set_retries(1) without restoring it, which can leak into other tests;
change the test to capture the current retry count before setting (e.g., let
prev = crate::get_retries() or otherwise read the current value), call
crate::set_retries(1) for the duration of the test, and then restore the
original value at the end (crate::set_retries(prev)) — ensure restoration runs
even if the test panics by using a guard/RAII pattern or a finally-style drop
guard inside test_object_exists_404_does_not_retry so the global retries are
returned to their original value.
|
Thanks for tightening the Verification before merge:
Merging this as the fix for #444. |
Linked issue: Fixes #444
Problem
Bucket::object_existsusedresponse_data(false), whose backend path wrapsresponse()inretry!. Withfail-on-err, an expected HEAD 404 is converted to an error beforeobject_existscan classify it, so the retry macro delays and logs warnings for a terminal “object missing” result.Solution
Add an internal
Request::response_statushook for status-only calls and haveobject_existsuse it. Built-in tokio, async-std, and sync backends classify status 404 asOk(404)beforeretry!sees an error, while preserving existingfail-on-errbehavior for non-404 failures.object_existsnow returnsOk(false)for 404 andOk(true)for any other successful status.Focused verification
cargo test -p rust-s3 test_object_exists_404_does_not_retry --features tokio-native-tls -- --nocapture— passed; regression test observed exactly one local HEAD request for a 404.cargo check -p rust-s3 --features tokio-native-tls,fail-on-err— passed.cargo check -p rust-s3 --no-default-features --features sync-native-tls,fail-on-err— passed.cargo check -p rust-s3 --no-default-features --features async-std-native-tls,fail-on-err— passed.Risk notes
response_statusremains wrapped inretry!.fail-on-errsemantics for ordinary response/body methods are unchanged; onlyobject_existsmoves to the status-only path.BucketAPI changes. The new trait method has a default implementation to avoid breaking external implementors, with built-in backend overrides for the no-retry 404 behavior.Self-review
Ok(false)before retry; non-404 failures still return errors through backend fail-on-err logic.object_existssignature and ordinary method behavior unchanged.This change is
Summary by CodeRabbit
Bug Fixes
Tests