feat(solana-indexer) PR 5.1: ingester drain loop by tilacog · Pull Request #4549 · cowprotocol/services

tilacog · 2026-06-22T19:07:08Z

Description

This PR fills in Ingester::run, which was previously just unimplemented!().

The ingester now pulls updates from an AutoReconnect-backed GeyserStream and pushes tagged StreamUpdates into the decoder channel. Slot filter messages advance the latest chain slot counter in memory.

Note that the ingester neither decodes messages nor writes the watermark; the decoder handles persistence.

Ingester::serve is the production entrypoint. It builds the subscription request, resumes from the stored watermark (from_slot = watermark + 1, or the live tip on a cold start), opens the stream through GeyserGrpcClient::subscribe_with_request, and runs the drain loop. The caller passes in a GeyserGrpcClient built with set_reconnect_config; otherwise the auto-reconnect wrapper will not actually reconnect.
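The resume rule above can be sketched as a small pure function. This is an illustration only, not the PR's code: `resume_from_slot` and its `Option<u64>` signature are hypothetical. `None` stands for "no stored watermark" on input and for "leave from_slot empty and start at the live tip" on output.

```rust
/// Hypothetical helper: map the stored watermark to the `from_slot` of the
/// subscription request.
///
/// - `Some(w)`: slot `w` is already processed, so resume at `w + 1`.
/// - `None`: cold start, so leave `from_slot` empty and begin at the live tip.
fn resume_from_slot(watermark: Option<u64>) -> Option<u64> {
    // `saturating_add` avoids an overflow panic at `u64::MAX`. That choice is
    // an assumption of this sketch, not something the PR specifies.
    watermark.map(|slot| slot.saturating_add(1))
}
```

Keeping the rule in one pure function makes the off-by-one (resume at the slot after the watermark, not at the watermark itself) easy to cover with a unit test.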

Reconnects, backoff, and resume from checkpoint are handled by the AutoReconnect stream wrapper. Keepalive is not: the ingester does not answer pings, so the GeyserGrpcClient passed to serve needs HTTP/2 keepalive configured. The wrapper also injects its own BlockMeta + slot filter under the __autoreconnect key so it can checkpoint and resume; those messages stay inside the wrapper and never reach the ingester. Recoverable errors are swallowed there too, so run only returns when the stream ends for good or the decoder receiver drops.
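The exit conditions of the drain loop can be sketched with the standard library alone. This is a simplified, synchronous stand-in: an iterator plays the role of the stream, `std::sync::mpsc` plays the role of the decoder channel, and the names `drain` and `Stopped` are hypothetical. The real ingester is async and forwards tagged StreamUpdates rather than bare integers.

```rust
use std::sync::mpsc::SyncSender;

/// Why the loop stopped (hypothetical; the PR uses its own `Error` enum).
#[derive(Debug, PartialEq)]
enum Stopped {
    StreamEnded,
    ReceiverDropped,
    StreamError(String),
}

/// Drain `stream` into `tx` until one of the terminal conditions is hit.
/// No reconnect or backoff logic lives here; that is the wrapper's job.
fn drain<I>(stream: I, tx: &SyncSender<u64>) -> Stopped
where
    I: IntoIterator<Item = Result<u64, String>>,
{
    for item in stream {
        match item {
            Ok(update) => {
                // A failed send means the receiver is gone.
                if tx.send(update).is_err() {
                    return Stopped::ReceiverDropped;
                }
            }
            // Any error that reaches this loop is terminal, because the
            // wrapper is assumed to have swallowed the recoverable ones.
            Err(status) => return Stopped::StreamError(status),
        }
    }
    Stopped::StreamEnded
}
```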

Ping/Pong frames are dropped. The library passes them through, but the ingester has no use for them.

Unit tests for the run loop are defined in this follow-up PR: #4550.

Changes

Implemented Ingester::run as a plain drain loop. It reads updates from the stream, dispatches them, and stops when the stream ends or the decoder channel closes. No reconnect or backoff logic inside the ingester.
Made the ingester generic over S: Stream<Item = Result<SubscribeUpdate, Status>> + Unpin + Send instead of tying it to a GrpcConnector. Production uses an AutoReconnect-backed GeyserStream; tests can pass any stream. new takes the stream, the decoder sender, and a shared Arc<AtomicU64> for the latest chain slot.
Added Ingester::serve, a production entrypoint generic over St: Store. It builds the SubscribeRequest, reads the watermark to set from_slot, opens the stream, and runs the drain loop. Added an Error enum for setup failures, terminal stream errors, and clean stream end.
Added subscribe_request, which defines the four program filters (settlement and solflow transactions and accounts, failed transactions included) plus a chain_tip slot filter at confirmed commitment. from_slot is left empty so serve can fill it from the watermark.
Added handle_update and its helpers handle_transaction, handle_account, and handle_slot. Transactions and accounts are forwarded as StreamUpdate::Tx / StreamUpdate::Account. Frames without a body or with malformed signatures are skipped with a warning. Slot messages only update latest_chain_slot in memory.
Added forward, which tries a non-blocking send into the decoder channel, falls back to a blocking send with a warning when the channel is full, and stops when the receiver is gone.
Replaced the module-level LATEST_CHAIN_SLOT: AtomicU64 static with an Arc<AtomicU64> owned by the Ingester. The watchdog and finalization worker will take read clones. Updated related doc comments in watchdog.rs, commitment.rs, and errors.rs.
Refactored Store to require Send + Sync and return impl Future<Output = Result<..., StoreError>> + Send from each method. This lets Ingester::serve be spawned with tokio::spawn while still allowing implementors to write async fn bodies.
Re-exported the yellowstone geyser types needed by the ingester and serve in types/wire.rs.
Added futures as a dependency for StreamExt.
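The send strategy described for forward can be sketched with a bounded standard-library channel. This is a synchronous approximation under stated assumptions: `std::sync::mpsc::SyncSender` stands in for the async decoder channel, `eprintln!` stands in for `tracing::warn!`, and the `StreamUpdate` payloads are placeholders.

```rust
use std::ops::ControlFlow;
use std::sync::mpsc::{SyncSender, TrySendError};

/// Stand-in for the PR's `StreamUpdate`; the `u64` payloads are placeholders.
#[derive(Debug, PartialEq)]
enum StreamUpdate {
    Tx(u64),
    Account(u64),
}

/// Try a non-blocking send first. If the channel is full, warn and fall back
/// to a blocking send. Break once the receiver is gone.
fn forward(tx: &SyncSender<StreamUpdate>, update: StreamUpdate) -> ControlFlow<()> {
    match tx.try_send(update) {
        Ok(()) => ControlFlow::Continue(()),
        // The consumer is behind: wait for room instead of dropping the update.
        Err(TrySendError::Full(update)) => {
            eprintln!("decoder channel full; blocked on backpressure");
            match tx.send(update) {
                Ok(()) => ControlFlow::Continue(()),
                Err(_) => ControlFlow::Break(()),
            }
        }
        // Nothing downstream will ever read again, so stop the loop.
        Err(TrySendError::Disconnected(_)) => ControlFlow::Break(()),
    }
}
```

The non-blocking attempt keeps the common path cheap, and the warning only fires when the consumer is actually falling behind. The update is never dropped on a full channel, which is what turns a slow decoder into backpressure rather than data loss.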

How to test

Implementation only; run-loop unit tests follow in the next PR.

cargo check -p solana-indexer
cargo clippy -p solana-indexer --all-targets
cargo +nightly fmt --all -- --check

This is a follow-up PR to #4514

Fills in the ingester run loop for the solana-indexer.

gemini-code-assist

Code Review

This pull request implements the Ingester component to drain the Yellowstone gRPC stream and forward transaction and account updates to the decoder. It also refactors the Store trait to support thread-safe async operations. Feedback on the changes suggests using fetch_max instead of store on the atomic slot counter to prevent regression from out-of-order updates, and rate-limiting the backpressure warning log to avoid log flooding when the decoder channel is full.
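The difference between store and fetch_max on the slot counter can be shown in a few lines. A minimal sketch: `observe_slot` is a hypothetical helper, and `Ordering::Relaxed` matches the ordering used in the PR's handle_slot.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Record a slot seen on the stream without ever moving the counter
/// backwards. Returns the value the counter holds after this update.
fn observe_slot(latest_chain_slot: &AtomicU64, slot: u64) -> u64 {
    // `fetch_max` stores max(current, slot) and returns the previous value.
    let previous = latest_chain_slot.fetch_max(slot, Ordering::Relaxed);
    previous.max(slot)
}
```

With a plain `store`, a late message for slot 99 arriving after slot 100 would set the counter back to 99. With `fetch_max` the counter stays at 100.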



MartinquaXD · 2026-06-23T11:24:07Z

+                }
+            }
+        }
+        tracing::info!("yellowstone stream ended; ingester stopping");


Given that the architecture leans heavily towards actors, have you already thought about graceful shutdowns?

Yes, and it is covered in the Notion page plan as one of the later steps.

MartinquaXD · 2026-06-23T11:24:24Z

+        latest_chain_slot: &AtomicU64,
+        update: SubscribeUpdate,
+    ) -> ControlFlow<()> {
+        use UpdateOneof::*;


Let's keep all imports at the top of the file.

MartinquaXD · 2026-06-23T11:25:55Z

+    /// Associated function taking the channel and chain-tip counter by
+    /// reference rather than `&self`, so the future borrows only those (both
+    /// `Sync`) fields across awaits. That keeps `run`'s future `Send` without
+    /// requiring `Ingester: Sync` — the `GeyserStream` field is `Send` but not
+    /// `Sync`.


IMO this is not a doc comment. Doc comments should explain the effects of a function without overwhelming the reader with implementation details. If impl details are important they should be added as regular code comments in the code sections they relate to.

MartinquaXD · 2026-06-23T11:27:54Z

+            Transaction(tx_msg) => Self::handle_transaction(tx, tx_msg).await,
+            Account(account) => Self::handle_account(tx, account).await,
+            Slot(slot) => Self::handle_slot(latest_chain_slot, slot).await,


This serializes the handling of each packet which can be slow depending on the futures. Is this intended? Do we not have to be worried about creating unnecessary back pressure here?

I believe this is intended. The per-message work here is just a channel send, which is fast. All the slow work (decode, DB writes) lives behind the channel in the decoder, so this loop stays quick.

cc @tilacog

MartinquaXD · 2026-06-23T11:29:18Z

+        let Some(inner) = tx_msg.transaction else {
+            tracing::warn!(slot = tx_msg.slot, "transaction update without a body");
+            return ControlFlow::Continue(());
+        };


Is this something that can actually happen or is transaction unnecessarily optional?

The yellowstone proto types transaction as optional, so it can be absent on an empty or malformed frame. We skip that case defensively rather than assume it is always present.
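The defensive skip can be sketched with let-else on trimmed-down stand-in types. `TxUpdate`, `TxInfo`, and `parse_signature` are hypothetical names for this illustration, and `eprintln!` replaces `tracing::warn!`. The 64-byte array mirrors the length of a Solana signature.

```rust
/// Hypothetical, trimmed-down stand-ins for the yellowstone wire types.
struct TxInfo {
    signature: Vec<u8>,
}

struct TxUpdate {
    slot: u64,
    /// Optional on the wire, so it may be absent.
    transaction: Option<TxInfo>,
}

/// Return the 64-byte signature of a usable frame, or `None` when the frame
/// is skipped (missing body or malformed signature).
fn parse_signature(msg: TxUpdate) -> Option<[u8; 64]> {
    let Some(inner) = msg.transaction else {
        eprintln!("slot {}: transaction update without a body", msg.slot);
        return None;
    };
    // Converting a slice to a fixed-size array fails unless it is exactly 64 bytes.
    let Ok(signature) = <[u8; 64]>::try_from(inner.signature.as_slice()) else {
        eprintln!(
            "slot {}: malformed signature ({} bytes)",
            msg.slot,
            inner.signature.len()
        );
        return None;
    };
    Some(signature)
}
```

Skipping with a warning keeps one bad frame from stopping the drain loop, while still leaving a trace in the logs.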

MartinquaXD · 2026-06-23T11:31:54Z

+        };
+        let Ok(signature) = Signature::try_from(inner.signature.as_slice()) else {
+            tracing::warn!(
+                slot = tx_msg.slot,


It's not too bad yet in this code but ideally slot should be passed via a tracing span and .instrument(). That way you don't have to remember and manually add the slot to every related log.

Done. handle_transaction and handle_account now carry a slot span via #[instrument], so the warns inside no longer thread slot through manually.

MartinquaXD · 2026-06-23T11:34:42Z

+    /// The persisted watermark could not be read.
+    #[error("failed to read the resume watermark: {0}")]
+    Store(#[from] StoreError),


The names of the error variants are too generic. Why not incorporate the additional context of the log messages into the names? Otherwise the reader will always have to jump to the error definition to understand anything about the error.

Each variant wraps a distinct source error via #[from] (e.g. StoreError, GeyserGrpcClientError, Status), and the #[error("…")] message spells out the context, so the name plus the message read clearly where it surfaces. Did I miss anything?

MartinquaXD · 2026-06-23T11:38:37Z

+        let request = SubscribeRequest {
+            from_slot,
+            ..request
+        };


What's the reason to not move this and the from_slot logic into subscribe_request()?

Makes sense. Moved.

MartinquaXD · 2026-06-23T11:40:06Z

+        ingester.run().await?;
+        Ok(())


nit: you can just return ingester.run().await, no?

MartinquaXD · 2026-06-23T11:43:28Z

 //! Re-exports of the `yellowstone-grpc-proto` message types the indexer
 //! consumes as its wire-format surface.
 pub use yellowstone_grpc_proto::{


What's the reasoning for re-exporting those types btw?

Probably two reasons: shorter imports, and a single place to fix if the upstream crate moves those types.

@tilacog ?

MartinquaXD · 2026-06-23T11:52:59Z

+    /// Consume a slot message: advance the in-memory chain-tip counter. Slot
+    /// messages never enter the channel, so this always continues.
+    async fn handle_slot(
+        latest_chain_slot: &AtomicU64,
+        slot: SubscribeUpdateSlot,
+    ) -> ControlFlow<()> {
+        latest_chain_slot.fetch_max(slot.slot, Ordering::Relaxed);
+        ControlFlow::Continue(())
+    }


Do we also have to update latest_chain_slot when we encounter the other message types? They also contain a slot number, after all.

Probably not. IIUC, the slot filter already sends one message per slot, so the tip advances on every slot. Tx and account updates only fire for the two programs we subscribe to (settlement and SolFlow), so their slots are always ones the slot filter has already covered; they would add nothing. (It's fetch_max, so bumping from them wouldn't be wrong, just a wasted atomic on the hot path.) Added a doc line on handle_slot explaining this.

cc @tilacog

squadgazzz

Will need to do another round.

squadgazzz · 2026-06-24T13:45:21Z

+                tracing::warn!("decoder channel full; ingester blocked on backpressure");
+                match tx.send(update).await {
+                    Ok(()) => ControlFlow::Continue(()),
+                    Err(_) => ControlFlow::Break(()),
+                }


Will the failed update itself be logged somewhere? I don't really see it.

squadgazzz · 2026-06-24T13:53:02Z

+            // Ping/Pong frames carry no data the ingester needs; the library passes them through,
+            // and we drop them here.
+            Ping(_) | Pong(_) => ControlFlow::Continue(()),


As I understand yellowstone, the server sends periodic Ping frames and expects a Pong back on the request stream, or it can drop an idle connection. Here Ping/Pong are ignored and serve drops, so nothing answers. The PR says AutoReconnect handles keepalive, but the linked reconnect.rs looks like it forwards pings to us without answering them. If that's right, the connection only stays up on HTTP/2 keepalive set on the GeyserGrpcClient.

Replying to myself 😂

We can't set it in this PR: keepalive is a builder option on the tonic endpoint, set before .connect(), and serve receives an already-built GeyserGrpcClient. There's also no live connection here - serve has no spawn site yet (the is_send helper never runs) and the tests use mock streams, so there's no idle socket to drop. The keepalive has to go where the client is built and spawned, which is the wiring PR. I documented the contract on serve. If we'd rather not rely on a doc note, the wiring PR can add a build_client helper that bakes in keepalive + reconnect config so the call site can't forget.

…ootstrap

Resolve conflicts from PR4's async-trait + Arc<dyn> base:
- store.rs: keep async_trait Store, drop PR5.1's impl-Future refactor (async_trait already yields Send-boxed futures)
- ingester.rs: keep PR5.1's drain-loop impl; make Ingester and Error pub(crate) to match the pub(crate) domain types; wrap wire u64 slots in Slot for StreamUpdate
- Cargo.toml: keep futures (PR5.1) plus async-trait/dashmap/derive_more (PR4), drop observe/prometheus
- errors.rs: keep the detailed ReplayWindowExceeded message

- hoist the UpdateOneof import out of handle_update and qualify the match arms
- demote the handle_update Send-workaround note from a doc comment to a regular comment
- warn on a present-but-malformed account txn_signature instead of dropping it silently
- log the latest chain slot on payload-less updates
- simplify serve to return ingester.run().await directly

- instrument handle_transaction/handle_account with a slot span instead of threading slot into each log
- subscribe_request takes from_slot as a param, dropping the post-build patch in serve
- document that serve's client needs HTTP/2 keepalive since the ingester does not answer pings, and drop the wrong claim that the wrapper handles keepalive


feat(solana-indexer): PR 5 — ingester drain loop

e8fd659

Fills in the ingester run loop for the solana-indexer.

tilacog requested a review from a team as a code owner June 22, 2026 19:07

gemini-code-assist Bot reviewed Jun 22, 2026

Comment thread crates/solana-indexer/src/indexer/ingester.rs

tilacog added 2 commits June 22, 2026 16:34

fix(solana-indexer): keep latest chain slot monotonic on out-of-order…

78ab02a

… slot messages

docs(solana-indexer): add TODO about rate-limiting backpressure warning

7466381

tilacog mentioned this pull request Jun 22, 2026

tests(solana-indexer) PR 5.2: Add unit tests for solana-indexer ingester component #4550

Open

fix: bump quinn-proto to 0.11.15 (RUSTSEC-2026-0185)

301c61b

Patches remote memory exhaustion DoS in quinn-proto via unbounded out-of-order stream reassembly. Transitive dep via solana-client.

tilacog changed the title ~~feat(solana-indexer): PR 5 — ingester drain loop~~ feat(solana-indexer): PR 5.1 — ingester drain loop Jun 23, 2026

tilacog changed the title ~~feat(solana-indexer): PR 5.1 — ingester drain loop~~ feat(solana-indexer) PR 5.1: ingester drain loop Jun 23, 2026

MartinquaXD reviewed Jun 23, 2026

squadgazzz reviewed Jun 24, 2026

squadgazzz mentioned this pull request Jun 26, 2026

feat(solana-indexer): PR 4 — Component structs (skeleton declarations) #4514

Open

tilacog mentioned this pull request Jun 26, 2026

feat(solana-indexer): PR 3 — add traits module #4508

Open

squadgazzz added 5 commits June 29, 2026 10:40

Merge branch 'solana-indexer/PR4-bootstrap' into solana-indexer/PR5.1…

7abf502

…-bootstrap

Merge branch 'solana-indexer/PR4-bootstrap' into solana-indexer/PR5.1…

73ae508

…-bootstrap # Conflicts: # crates/solana-indexer/src/indexer/ingester.rs

Merge branch 'solana-indexer/PR4-bootstrap' into solana-indexer/PR5.1…

3d0e737

…-bootstrap # Conflicts: # crates/solana-indexer/src/indexer/ingester.rs
