[Driver] Stream the auction JSON body to S3 + solvers by jmg-duarte · Pull Request #4575 · cowprotocol/services

jmg-duarte · 2026-06-29T14:50:13Z

Description

Streams the auction JSON through the network, avoiding the large memory spike from generating it upfront and then working with it.

The implementation tee's the JSON to the solvers and S3; in the S3 path it gets gzipped first (if only brotli was properly supported 😮‍💨)

Following numbers are with 8 chunks in memory

Big picture (Arb1 shadow, Arb1 Staging, Avalanche Staging)

Avalanche, in yellow, before the patch, in blue after (with a trade)

Longer time horizon — note the size of the spikes in orange VS the purple ones

These numbers are with 2 chunks

In blue, orders I placed

Production, Base, avg_over_time CPU

Production, Base, current + avg_over_time Memory

The tests above were carried out without 381aa8a which we've seen reduce the memory spikes

Changes

New tee stream
New chunking stream

How to test

Graphs above, S3 auctions are intact

MartinquaXD · 2026-06-29T15:19:08Z

+    S: Write + Finalize + Send + 'static,
+{
+    let (tx, rx) = mpsc::channel::<std::io::Result<Bytes>>(CHANNEL_CAPACITY);
+    let body = reqwest::Body::wrap_stream(ReceiverStream::new(rx));


Can we use the ByteStream that currently lives in the autopilot? That would automatically provide metrics about how long each solver needs to stream the bytes which can be very helpful to check if an issue is on our end or on the solver's end.

done but this makes the PR bigger and a bit harder to reason about, i'd extract it to reduce burden on you guys

Okay. Then let's leave this for later to not have too many changes in the same PR.

Review fixes that don't depend on the transmission-timing wrapper: - Reduce CHANNEL_CAPACITY 8 -> 2; double buffering is enough to keep the consumer fed without raising the memory ceiling. - Carry raw `Bytes` over the body channel and wrap them into infallible `Ok`s at the `wrap_stream` boundary instead of channeling `Result`s. - Move the gzip compression policy into the `s3` crate via `GzipWriter`, so the streamed archive copy is compressed with the same settings as the eager upload path. - Use the `mod.rs` layout for the `streaming` module. - Note the UNIX `tee` lineage in `TeeWriter`'s docs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Extract the request-body transmission timing that lived in autopilot's `ByteStream` into a generic `observe::http_body::Measured` stream wrapper, and apply it to the driver's streamed `/solve` body so the driver->solver hop also reports `to_transmission_start_ms` / `transmission_ms`. autopilot now wraps a plain single-chunk stream with `Measured` and drops the bespoke `ByteStream` type, which was only a `Bytes`-to-stream adapter once its timing moved out. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

claude · 2026-06-30T11:05:28Z

Claude finished @jmg-duarte's task in 3m 11s —— View job

PR Review: Stream the auction JSON body to S3 + solvers

Nice change — the tee + chunked-streaming approach is clean and the memory-spike graphs are convincing. The TeeWriter/Finalize/GzipCapture decomposition reads well and the tests cover the important paths (byte forwarding + gzip round-trip). A few things worth a look before merge, posted inline:

Read changed files and understand the streaming/tee implementation
Review for correctness, error handling, and concurrency issues
Post inline comments
Post summary

Findings (inline):

serialize_request overhead metric flatlines — serialization is now deferred into the streaming task, so the measurement right after building the body captures near-zero instead of the actual serialize cost. (solver/mod.rs:410)
Archival is now coupled to solver-request success — if the solver connection drops/times out mid-upload, serialization aborts before finalize() and the S3 archive is silently skipped (regression vs. the eager archive_auction on main). Also flags that the spawn_blocking thread is now held for the whole upload. (solver/mod.rs:400)
Truncated body on serialization error (low risk) — the Infallible stream can't abort the request, so a mid-stream serialize error sends a truncated-but-complete-looking body. Noted as the trade-off behind the Infallible simplification. (streaming/mod.rs:56)

Nothing blocking from a pure-correctness standpoint; #2 is the one I'd most want a deliberate decision on (is dropping the archive on a failed solver request acceptable?). The s3-crate-vs-driver compression placement and ByteStream metrics points were already discussed and deferred, so I didn't re-raise them.

claude · 2026-06-30T11:08:14Z

+        if auction.id().is_some() {
            ::observe::metrics::metrics().measure_auction_overhead(
                start,
                "driver",
                "serialize_request",
            );
        }


The serialize_request overhead metric no longer measures serialization. stream_body/stream_body_and_gzip return immediately with a lazy body handle — the actual serde_json::to_writer work now happens inside the spawn_blocking task, driven by network backpressure. So start..now here captures only the DTO construction + channel setup (near-zero), not the serialization cost it used to measure on the eager path.

If the intent is to keep tracking serialization time, the measurement needs to move into the streaming task (e.g. around serde_json::to_writer); otherwise this metric will silently flatline after this change.

I'm not sure it's possible to measure this without the stream wrapper we discussed

claude · 2026-06-30T11:08:25Z

+        let body: reqwest::Body = match archive_id {
+            // Stream the request body while capturing a gzipped copy for S3, so
+            // neither the request nor the archive holds the full JSON at once.
+            Some(id) => {
+                let (body, compressed) = streaming::stream_body_and_gzip(auction_dto);
+                self.persistence.archive_auction_gzipped(id, compressed);
+                body
+            }
+            None => streaming::stream_body(auction_dto),
+        };


Behavior change worth a second look: S3 archival is now coupled to the solver request streaming successfully to completion.

On main, the auction was serialized eagerly into Bytes and archive_auction uploaded that buffer independently of the solver request outcome. Here, the gzipped bytes only reach archive_auction_gzipped when finalize() runs, which only happens if serde_json::to_writer completes — and that requires reqwest to pull the entire body. If the solver connection drops or times out mid-upload, ChannelWriter::blocking_send errors → serialization aborts early → GzipCapture is dropped without finalize() → the oneshot sender is dropped → the archive is silently skipped.

Net effect: auctions whose solver request fails partway through transmission no longer get archived — which may be exactly the auctions you'd want to inspect later. Is dropping the archive in that case intended? If not, consider decoupling the archive from the request stream (e.g. tee into an independent buffer rather than gating finalize on full request consumption).

Secondary note: the spawn_blocking thread is now held for the full duration of the body upload (it blocks on blocking_send until reqwest drains each chunk), rather than being released right after producing the buffer. Per-solve that's a blocking-pool thread occupied for the whole solver round-trip.

I solved this with the TeeWriter and the BestEffortSink, which will allow the TeeWriter to continue uploading to S3 when this scenarion happens

gemini-code-assist

Code Review

This pull request introduces streaming serialization and compression for auction payloads to optimize memory usage. Instead of buffering the entire uncompressed JSON in memory, the request body is streamed to the solver while a gzipped copy is captured on the fly for S3 archival. Feedback on these changes highlights the need to elevate the log level for serialization and flushing errors from debug to warn/error, and to make the TeeWriter::write implementation more robust by properly handling partial writes instead of assuming full writes.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

MartinquaXD

LGTM overall. When I briefly looked at the metrics I remember this incurring some non-trivial CPU overhead. Could you please provide updated metrics for this?
Ideally from some network with reasonably sized auctions. Given that this was already e2e tested thoroughly one should probably just run it in prod for a little bit to get the more accurate and relevant numbers for this.

Streaming JSON

ed8210a

jmg-duarte force-pushed the jmgd/stream-solve-body branch from 3377af2 to ed8210a Compare June 29, 2026 15:16

MartinquaXD reviewed Jun 29, 2026

View reviewed changes

jmg-duarte and others added 6 commits June 29, 2026 18:50

minimize comments

1eecefa

rollback the bytestream change

6e061c6

finalize for tee

5bdd45f

clean up

f478b5f

jmg-duarte marked this pull request as ready for review June 30, 2026 11:05

jmg-duarte requested a review from a team as a code owner June 30, 2026 11:05

claude Bot reviewed Jun 30, 2026

View reviewed changes

Comment thread crates/driver/src/infra/solver/streaming/mod.rs

gemini-code-assist Bot reviewed Jun 30, 2026

View reviewed changes

Comment thread crates/driver/src/infra/solver/streaming/mod.rs

Comment thread crates/driver/src/infra/solver/streaming/mod.rs

Comment thread crates/driver/src/infra/solver/streaming/tee_writer.rs

jmg-duarte added 3 commits June 30, 2026 12:31

fmt

b4aef68

BestEffortSink to swallow errors, avoiding tee from failing

779134f

Handle serialize_request metrics

760cd19

MartinquaXD approved these changes Jul 1, 2026

View reviewed changes

Comment thread crates/driver/Cargo.toml Outdated

Comment thread crates/driver/src/infra/solver/streaming/mod.rs

Comment thread crates/driver/src/infra/solver/mod.rs Outdated

jmg-duarte added 2 commits July 1, 2026 11:09

Merge branch 'main' into jmgd/stream-solve-body

dc888a3

address comments

dead209

Uh oh!

Conversation

jmg-duarte commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes

How to test

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MartinquaXD Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

jmg-duarte Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

MartinquaXD Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

claude Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Stream the auction JSON body to S3 + solvers

Uh oh!

claude Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

jmg-duarte Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

jmg-duarte Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MartinquaXD left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jmg-duarte commented Jun 29, 2026 •

edited

Loading

claude Bot commented Jun 30, 2026 •

edited

Loading