Skip to content

fix: add upload retry#399

Open
not-matthias wants to merge 1 commit into
mainfrom
cod-2839-retry-upload-on-connection-errorstimeouts
Open

fix: add upload retry#399
not-matthias wants to merge 1 commit into
mainfrom
cod-2839-retry-upload-on-connection-errorstimeouts

Conversation

@not-matthias

@not-matthias not-matthias commented Jun 11, 2026

Copy link
Copy Markdown
Member

No description provided.

@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 7 untouched benchmarks


Comparing cod-2839-retry-upload-on-connection-errorstimeouts (2298ed2) with main (41f4db9)

Open in CodSpeed

@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from df83123 to bee5a31 Compare June 11, 2026 14:41
@not-matthias not-matthias marked this pull request as ready for review June 11, 2026 16:48
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown

Greptile Summary

This PR adds upload retry support for the disk-backed (WallTime/Memory) streaming path, which previously sent exactly one attempt because the reqwest-middleware retry middleware cannot replay a consumed stream body. A new send_streamed_with_retry function manually re-opens the file on each attempt and reuses the same ExponentialBackoff policy and DefaultRetryableStrategy classifier as the existing middleware-backed in-memory path.

  • upload_backoff() is extracted to request_client.rs as a shared public function; it shrinks backoff intervals to milliseconds under #[cfg(test)] so retry tests complete quickly.
  • send_streamed_with_retry loops up to 1 + UPLOAD_RETRY_COUNT times, rebuilding the stream from disk on each attempt and computing the sleep duration from the policy's should_retry output.
  • Two integration tests using a raw TCP mock verify that both the streaming and in-memory paths retry exactly 1 + UPLOAD_RETRY_COUNT times before surfacing the error.

Confidence Score: 4/5

The change is safe to merge. The core retry logic is correct and directly mirrors how the existing middleware-backed path works; integration tests confirm both paths retry the expected number of times.

The retry loop correctly rebuilds the stream on each attempt, and the backoff policy and transient classification are shared with the proven middleware path. The two notes — a misleading doc comment in a test and the absence of a clarifying comment about non-success responses being returned as Ok — are cosmetic and do not affect runtime behavior.

No files require special attention, though the doc comment on in_memory_upload_is_retried and the fallthrough behavior in send_streamed_with_retry are worth a quick read.

Important Files Changed

Filename Overview
src/request_client.rs Extracts upload_backoff() as a shared public function with #[cfg(test)] bounds for fast test runs; REQUEST_CLIENT now calls it instead of inlining ExponentialBackoff::builder().
src/upload/uploader.rs Adds send_streamed_with_retry() to manually retry disk-backed stream uploads (previously unretried), reusing the shared backoff policy; includes two integration tests verifying retry counts for both the streaming and in-memory paths.

Sequence Diagram

sequenceDiagram
    participant U as upload_profile_archive
    participant S as send_streamed_with_retry
    participant P as ExponentialBackoff policy
    participant D as DefaultRetryableStrategy
    participant SC as STREAMING_CLIENT

    U->>S: path, upload_data, archive_size, hash, encoding
    S->>P: upload_backoff() — create policy
    loop up to 1 + UPLOAD_RETRY_COUNT attempts
        S->>SC: File::open(path) → ReaderStream → Body
        SC->>SC: PUT upload_url (stream body)
        SC-->>S: "Result<Response, Error>"
        S->>D: "handle(&result) — classify error"
        D-->>S: "Some(Transient) | Some(Fatal) | None"
        alt Transient AND policy allows retry
            S->>P: should_retry(start, n_past_retries)
            P-->>S: "RetryDecision::Retry { execute_after }"
            S->>S: sleep(execute_after - now)
            S->>S: "n_past_retries += 1"
        else Fatal / success / retries exhausted
            S-->>U: Ok(Response) or Err
        end
    end
    U->>U: check response.status().is_success()
Loading

Reviews (1): Last reviewed commit: "fix(upload): retry streamed uploads on t..." | Re-trigger Greptile

Comment thread src/upload/uploader.rs Outdated
Comment thread src/upload/uploader.rs
Streamed (on-disk) uploads went through STREAMING_CLIENT, which has no
retry middleware because a consumed stream body can't be replayed. Add a
manual retry loop that rebuilds the stream from disk on each attempt and
reuses reqwest_retry's backoff policy and transient classifier, matching
the in-memory path's behavior.

Extract the shared backoff into request_client::upload_backoff() so both
the retry middleware and the streamed loop use one policy; under cfg(test)
its intervals shrink to milliseconds to keep the retry tests fast. Add
tests asserting both the streamed and in-memory paths retry transient
503s 1 + UPLOAD_RETRY_COUNT times.
@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from bee5a31 to 2298ed2 Compare June 11, 2026 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant