From f6293362b19ab0dcf60e77ff102a621c29ee6ca8 Mon Sep 17 00:00:00 2001 From: docs-agent Date: Fri, 15 May 2026 17:27:55 +0000 Subject: [PATCH 1/3] [docs-agent] Add Solana historical account state snapshots guide New MDX page at content/api-reference/solana/historical-account-state.mdx covering the two supported workflows for reconstructing point-in-time Solana account state: forward snapshotting via Yellowstone gRPC subscriptions and historical backfill via getSignaturesForAddress + getTransaction replay. Includes guidance on sampling cadence, filter options, gRPC reconnect via from_slot replay, and a trade-offs comparison. Added under the Solana section in docs.yml alongside AccountsDB Infrastructure. Refs DOCS-84 Requested-by: @SahilAujla --- .../solana/historical-account-state.mdx | 202 ++++++++++++++++++ content/docs.yml | 2 + 2 files changed, 204 insertions(+) create mode 100644 content/api-reference/solana/historical-account-state.mdx diff --git a/content/api-reference/solana/historical-account-state.mdx b/content/api-reference/solana/historical-account-state.mdx new file mode 100644 index 000000000..b0e4e60ec --- /dev/null +++ b/content/api-reference/solana/historical-account-state.mdx @@ -0,0 +1,202 @@ +--- +title: Snapshotting historical Solana account state +description: Patterns for capturing Solana account state at regular intervals (hourly, daily, or arbitrary slot cadence) using Alchemy's Yellowstone gRPC streams and archival JSON-RPC methods. +subtitle: Capture point-in-time Solana account state at any cadence using gRPC subscriptions and archival JSON-RPC +slug: docs/solana/historical-account-state +--- + +Many Solana workloads need to know what an account looked like at a previous point in time: portfolio analytics, TVL history, vault accounting, oracle audits, treasury reporting, ML feature stores. Standard Solana JSON-RPC does not expose a "read this account as of slot N" primitive. 
This page covers the two patterns Alchemy supports today for reconstructing historical account state, how to combine them, and how to land at common cadences like hourly snapshots.
+
+## Why Solana RPC alone is not enough
+
+`getAccountInfo` returns the **current** state of an account. The `minContextSlot` parameter is a freshness floor (the read must be evaluated at a slot greater than or equal to `minContextSlot`), not a historical lookup. Solana validators do not retain prior versions of an account's data once the account is updated, so no JSON-RPC method can answer "what did this account hold at slot N" directly.
+
+What Solana **does** retain is the full transaction and block history. Alchemy's archival infrastructure exposes that history through `getTransaction`, `getBlock`, and `getSignaturesForAddress`, which means you can reconstruct historical account state by:
+
+1. **Capturing it forward** as the chain advances, persisting periodic snapshots to your own store.
+2. **Reconstructing it backward** by replaying the transactions that wrote to the account.
+
+Most production indexers combine both: stream forward from "now" while backfilling history for the period before the stream started.
+
+## Workflow A: forward snapshotting via Yellowstone gRPC
+
+[Yellowstone gRPC](/docs/reference/yellowstone-grpc-overview) is the recommended path for ongoing state capture. A single subscription receives every account update for the addresses or programs you care about, with the slot number attached to each update, so you can persist each snapshot with minimal latency.
+
+### How it works
+
+1. Open a Yellowstone gRPC subscription with an account filter ([`accounts`](/docs/reference/yellowstone-grpc-subscribe-accounts) in the [`SubscribeRequest`](/docs/reference/yellowstone-grpc-subscribe-request)).
+2. On every `SubscribeUpdateAccount` your handler receives, write a row containing `(pubkey, slot, write_version, data, lamports, owner, txn_signature)` to your store.
+3. 
Index the resulting table by `(pubkey, slot)` so you can query the latest version at or before any target slot. + +To produce hourly snapshots, you do not need a separate sampling job. Solana blocks land at roughly 400 ms, so an active account may have many writes per hour and a quiet account may have none. Both cases are handled the same way: a query for "the state of this account at hour H" becomes "the row with the largest `slot` that is less than or equal to the slot corresponding to the timestamp at the end of hour H". Use `getBlockTime` or `getBlocks` to map wall-clock timestamps to slots. + +### Filter options + +The Yellowstone account filter supports three styles, in order of selectivity: + +* **Specific addresses** ([`account`](/docs/reference/yellowstone-grpc-subscribe-accounts#account-address-filter)): pass a list of pubkeys to watch. Best when you know the accounts upfront. +* **Program owner** ([`owner`](/docs/reference/yellowstone-grpc-subscribe-accounts#owner-filter)): receive every account owned by a given program. Use this when the account set is dynamic (for example, all vault accounts under a single program). +* **Memcmp / data-size / lamports / `token_account_state`** ([`filters`](/docs/reference/yellowstone-grpc-subscribe-accounts#memcmp-filter)): narrow further by byte patterns, account discriminator, or balance ranges. Combine with `owner` to watch a specific subset of a program's accounts. + +If you need to enumerate the initial set of accounts before subscribing (for example, all program-derived accounts under a vault program), use `getProgramAccounts` paginated via Alchemy's [AccountsDB Infrastructure](/docs/solana/accounts-db-infra). The `pageKey` and `order` parameters let you scan large account sets without timing out. 
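The `pageKey` loop for that initial enumeration can be sketched as follows. This is a hedged illustration, not a confirmed API contract: the `pageKey` and `order` parameter names follow the AccountsDB Infrastructure docs linked above, while the `RpcCall` transport function and the `GpaPage` response shape (an `accounts` array plus an optional `pageKey` cursor) are assumptions made so the example stays self-contained.

```typescript
// Sketch: enumerate every account under a program via paginated
// getProgramAccounts. `pageKey`/`order` follow the AccountsDB docs; the
// response shape below is an assumption for illustration.
type ProgramAccount = { pubkey: string; account: { lamports: number } };
type GpaPage = { accounts: ProgramAccount[]; pageKey?: string };

// Injected JSON-RPC transport so the loop stays independent of any HTTP client.
type RpcCall = (method: string, params: unknown[]) => Promise<GpaPage>;

async function enumerateProgramAccounts(
  rpc: RpcCall,
  programId: string
): Promise<ProgramAccount[]> {
  const all: ProgramAccount[] = [];
  let pageKey: string | undefined = undefined;

  do {
    // Each request returns one page plus an opaque cursor for the next page;
    // omit pageKey entirely on the first request.
    const page = await rpc("getProgramAccounts", [
      programId,
      { encoding: "base64", order: "asc", ...(pageKey ? { pageKey } : {}) },
    ]);
    all.push(...page.accounts);
    pageKey = page.pageKey;
  } while (pageKey !== undefined);

  return all;
}
```

Injecting the transport also makes the pagination logic testable without touching the network.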
+ +### Minimal Rust example + +```rust +use anyhow::Result; +use futures::{sink::SinkExt, stream::StreamExt}; +use std::collections::HashMap; +use yellowstone_grpc_client::{ClientTlsConfig, GeyserGrpcClient}; +use yellowstone_grpc_proto::geyser::{ + CommitmentLevel, SubscribeRequest, SubscribeRequestFilterAccounts, + subscribe_update::UpdateOneof, +}; + +#[tokio::main] +async fn main() -> Result<()> { + let endpoint = "https://solana-mainnet.g.alchemy.com"; + let x_token = "YOUR_ALCHEMY_API_KEY"; + + let mut client = GeyserGrpcClient::build_from_shared(endpoint)? + .tls_config(ClientTlsConfig::new().with_native_roots())? + .x_token(Some(x_token))? + .connect() + .await?; + + let (mut tx, mut stream) = client.subscribe().await?; + + // Subscribe to a specific set of account pubkeys + let mut accounts = HashMap::new(); + accounts.insert( + "accounts_to_snapshot".to_string(), + SubscribeRequestFilterAccounts { + account: vec![ + "AccountPubkey1...".to_string(), + "AccountPubkey2...".to_string(), + ], + owner: vec![], + filters: vec![], + nonempty_txn_signature: Some(true), + }, + ); + + tx.send(SubscribeRequest { + accounts, + commitment: Some(CommitmentLevel::Confirmed as i32), + ..Default::default() + }) + .await?; + + while let Some(Ok(msg)) = stream.next().await { + if let Some(UpdateOneof::Account(update)) = msg.update_oneof { + if let Some(info) = update.account { + // Persist this snapshot. Index by (pubkey, slot, write_version). + println!( + "slot={} pubkey={} lamports={} data_len={}", + update.slot, + bs58::encode(&info.pubkey).into_string(), + info.lamports, + info.data.len() + ); + } + } + } + + Ok(()) +} +``` + +For client setup, authentication details, and language samples in TypeScript and Go, see the [Yellowstone gRPC Quickstart](/docs/reference/yellowstone-grpc-quickstart). + +### Recovering from disconnects + +Yellowstone supports replaying historical updates by setting `from_slot` on the `SubscribeRequest`. 
The replay window is up to **6000 slots (~40 minutes)** of history. Persist the slot of the last update you successfully wrote, and on reconnect resubscribe with `from_slot` set to that slot. If your downtime exceeds the replay window, fall back to the backfill workflow below for the gap. + +## Workflow B: historical backfill via transaction replay + +Snapshotting forward only captures state from the moment your subscription starts. To answer questions about earlier periods, replay the transactions that touched the account. + +### Pattern + +1. Call [`getSignaturesForAddress`](/docs/chains/solana/solana-api-endpoints/get-signatures-for-address) for the target account, paging through history with `before` and `until` cursors. Returns signatures plus block times. +2. For each signature, call [`getTransaction`](/docs/chains/solana/solana-api-endpoints/get-transaction) to fetch the full transaction, pre/post balances, token balances, and inner instructions. +3. Apply each transaction's effect on the account to your local replica, in slot order. Stop and write a snapshot whenever the next transaction crosses your target sampling boundary (the next hour, the next day, or any other cadence). + +Alchemy's Solana archival surface is optimized for this kind of historical scan. Per the [Built for Solana](https://www.alchemy.com/blog/solana-infrastructure) release, `getTransaction` is up to 20x faster than other providers on historical calls, and `getSignaturesForAddress` supports recency-first ordering so you can walk from the present backward without scanning from genesis. + +### When you need program-account state, not just a single account + +If you are snapshotting state across an entire program (for example, every position account under a lending program), do the initial enumeration with paginated [`getProgramAccounts`](/docs/chains/solana/solana-api-endpoints/get-program-accounts) calls. 
See [AccountsDB Infrastructure](/docs/solana/accounts-db-infra) for the `pageKey` + `order` pattern. Once you have the address set, you can either backfill each address with workflow B, or subscribe to the program owner via workflow A to capture future changes.
+
+### Minimal TypeScript example
+
+```typescript
+import { Connection, PublicKey } from "@solana/web3.js";
+
+const connection = new Connection(
+  "https://solana-mainnet.g.alchemy.com/v2/YOUR_ALCHEMY_API_KEY",
+  "confirmed"
+);
+
+async function walkAccountHistory(
+  address: string,
+  untilSlot: number
+): Promise<void> {
+  const pubkey = new PublicKey(address);
+  let before: string | undefined = undefined;
+
+  while (true) {
+    const sigs = await connection.getSignaturesForAddress(pubkey, {
+      before,
+      limit: 1000,
+    });
+    if (sigs.length === 0) break;
+
+    for (const sig of sigs) {
+      if (sig.slot < untilSlot) return;
+
+      const tx = await connection.getTransaction(sig.signature, {
+        maxSupportedTransactionVersion: 0,
+      });
+      if (!tx) continue;
+
+      // Apply the transaction's effect on the target account here.
+      // For lamports-only changes, use tx.meta.preBalances/postBalances.
+      // For program-state changes, decode the account data using your IDL.
+      console.log(`slot=${sig.slot} signature=${sig.signature}`);
+    }
+
+    before = sigs[sigs.length - 1].signature;
+  }
+}
+```
+
+## Choosing a sampling cadence
+
+The two workflows above give you per-update granularity. Picking a coarser cadence (hourly, daily) is a query-time concern, not an ingest-time one. Two common approaches:
+
+* **Store every update, derive snapshots at query time.** Recommended when the account set is small or update volume is moderate. Lets you change cadence later without re-ingesting.
+* **Materialize fixed-cadence rollups.** Run a periodic job that, for each tracked account, writes the latest value at or before each cadence boundary into a separate table. Reduces query cost when you only ever read at fixed intervals.
+
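The first approach reduces to a pure query over the per-update rows. A minimal in-memory sketch, assuming rows are stored sorted ascending by `(slot, write_version)` in ingest order; the `SnapshotRow` field names are illustrative, not a fixed schema:

```typescript
// Sketch: derive fixed-cadence snapshots at query time from a per-update store.
// Rows are assumed sorted ascending by (slot, writeVersion), mirroring ingest order.
type SnapshotRow = { slot: number; writeVersion: number; lamports: number };

// For each cadence boundary (expressed as a slot), return the latest row at or
// before that slot — the in-memory analogue of
// ORDER BY slot DESC, write_version DESC LIMIT 1.
function snapshotsAtBoundaries(
  rows: SnapshotRow[],
  boundarySlots: number[]
): (SnapshotRow | undefined)[] {
  return boundarySlots.map((boundary) => {
    let best: SnapshotRow | undefined;
    for (const row of rows) {
      if (row.slot > boundary) break; // sorted input: nothing later qualifies
      best = row; // each later row has a larger (slot, writeVersion)
    }
    return best;
  });
}
```

A boundary earlier than the first stored update yields `undefined`, which is exactly the "no data yet for this account" case a rollup job should skip.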
+ +To map wall-clock timestamps to slots for the boundaries, call [`getBlockTime`](/docs/chains/solana/solana-api-endpoints/get-block-time) on a candidate slot, or use [`getBlocks`](/docs/chains/solana/solana-api-endpoints/get-blocks) to find the slot range that bounds a target timestamp. + +## Workflow comparison + +| Aspect | Yellowstone gRPC (Workflow A) | Transaction replay (Workflow B) | +| ---------------------- | ---------------------------------------------------- | -------------------------------------------------------------- | +| Time horizon | From subscription start, forward | From any point in history, backward | +| Latency to new data | Real-time (sub-second) | As fast as you can page archival history | +| Filter granularity | Address, owner, memcmp, data size, lamports | Per-address (via `getSignaturesForAddress`) | +| Reconstruction effort | None — receive full account bytes per update | Decode each transaction's effect on the account using your IDL | +| Replay after gap | `from_slot`, up to 6000 slots (~40 min) | Unbounded, limited only by archival depth | +| Best for | Ongoing capture of a known account or program set | Backfilling history before your stream started | +| Plan requirement | PAYG or Enterprise | Available on all paid plans | + +Combine the two: turn on the gRPC subscription first, persist its slot as your "snapshot start", then run the backfill from your business epoch up to that slot. Once the backfill completes, your store has continuous coverage. + +## Related references + +* [AccountsDB Infrastructure](/docs/solana/accounts-db-infra) — paginated `getProgramAccounts` and `getTokenLargestAccounts` for enumerating large account sets. +* [Yellowstone gRPC Overview](/docs/reference/yellowstone-grpc-overview) and [Subscribe to Accounts](/docs/reference/yellowstone-grpc-subscribe-accounts) — full filter reference and protobuf definitions. 
+* [Solana API FAQ — Historical Data (Archival)](/docs/solana-api-faq#historical-data-archival) — the full set of archival JSON-RPC methods. +* [Built for Solana](https://www.alchemy.com/blog/solana-infrastructure) and [How Alchemy Built the Fastest Archival Methods on Solana](https://www.alchemy.com/blog/how-alchemy-built-the-fastest-archival-methods-on-solana) — background on the archival stack powering the methods above. diff --git a/content/docs.yml b/content/docs.yml index c66c6c85e..04cc4e1dc 100644 --- a/content/docs.yml +++ b/content/docs.yml @@ -1276,6 +1276,8 @@ navigation: href: https://solana-demo-sigma.vercel.app/ - page: Accounts DB Infrastructure path: api-reference/solana/accounts-db-infra.mdx + - page: Historical account state + path: api-reference/solana/historical-account-state.mdx - section: Tutorials contents: - link: Hello World Solana Application From bfdf540b07bf0db996cfc33f6e09a723f9aac413 Mon Sep 17 00:00:00 2001 From: docs-agent Date: Fri, 15 May 2026 18:06:44 +0000 Subject: [PATCH 2/3] [docs-agent] Use (slot, write_version) for snapshot ordering Addresses codex review feedback on PR #1303: a single Solana slot can contain multiple writes to the same account, so indexing by (pubkey, slot) alone makes 'latest version at or before slot N' ambiguous and can return an earlier intra-slot state. Update the storage layout and the SQL example to use (pubkey, slot, write_version) for indexing and ORDER BY slot DESC, write_version DESC LIMIT 1 for the snapshot lookup. Add a paragraph explaining what write_version is and why the tie-break matters. 
Refs DOCS-84 Requested-by: @SahilAujla --- content/api-reference/solana/historical-account-state.mdx | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/api-reference/solana/historical-account-state.mdx b/content/api-reference/solana/historical-account-state.mdx index b0e4e60ec..930c50306 100644 --- a/content/api-reference/solana/historical-account-state.mdx +++ b/content/api-reference/solana/historical-account-state.mdx @@ -26,9 +26,11 @@ Most production indexers combine both: stream forward from "now" while backfilli 1. Open a Yellowstone gRPC subscription with an account filter ([`accounts`](/docs/reference/yellowstone-grpc-subscribe-accounts) in the [`SubscribeRequest`](/docs/reference/yellowstone-grpc-subscribe-request)). 2. On every `SubscribeUpdateAccount` your handler receives, write a row containing `(pubkey, slot, write_version, data, lamports, owner, txn_signature)` to your store. -3. Index the resulting table by `(pubkey, slot)` so you can query the latest version at or before any target slot. +3. Index the resulting table by `(pubkey, slot, write_version)` so you can query the latest version at or before any target slot. -To produce hourly snapshots, you do not need a separate sampling job. Solana blocks land at roughly 400 ms, so an active account may have many writes per hour and a quiet account may have none. Both cases are handled the same way: a query for "the state of this account at hour H" becomes "the row with the largest `slot` that is less than or equal to the slot corresponding to the timestamp at the end of hour H". Use `getBlockTime` or `getBlocks` to map wall-clock timestamps to slots. +A single Solana slot can contain multiple transactions that write to the same account. The `write_version` field is a monotonically increasing counter that disambiguates those intra-slot updates: the final state for a slot is the row with the highest `write_version` at that slot. 
Always tie-break by `(slot, write_version)` rather than `slot` alone. + +To produce hourly snapshots, you do not need a separate sampling job. Solana blocks land at roughly 400 ms, so an active account may have many writes per hour and a quiet account may have none. Both cases are handled the same way: a query for "the state of this account at hour H" returns the row with the largest `(slot, write_version)` pair where `slot` is less than or equal to the slot corresponding to the timestamp at the end of hour H. In SQL that is `ORDER BY slot DESC, write_version DESC LIMIT 1`. Use `getBlockTime` or `getBlocks` to map wall-clock timestamps to slots. ### Filter options From 2f5bd3adeaba11772f187e74154c58c750c45825 Mon Sep 17 00:00:00 2001 From: docs-agent Date: Fri, 15 May 2026 18:38:58 +0000 Subject: [PATCH 3/3] [docs-agent] Tighten workflow B scope to balances + log events MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses review feedback from @deepakbnsl on PR #1303. The original workflow B description and TypeScript example implied that getTransaction is sufficient to reconstruct arbitrary program account data, which is incorrect — getTransaction returns pre/post lamport balances, pre/post token balances, inner instructions, and log messages, but NOT pre/post account data blobs. Changes: * New 'What getTransaction provides' subsection enumerates what the response actually contains and explicitly notes the absence of pre/post account data fields. * New Warning callout scoping workflow B to SOL lamport balances, SPL token balances, and program-emitted log events. Documents that arbitrary program account data reconstruction requires either a starting data snapshot + program-specific instruction decoding, or comprehensive event logs from the program. 
* Rewrote the Pattern subsection to be concrete about reading meta.postBalances at the target account index, and clarified that meta.preBalances at T+1 equals meta.postBalances at the most recent earlier touching transaction so the post-side is the canonical value per slot. * Rewrote 'When you need program-account state, not just a single account' as 'When you need program-account state, not just balances' with the correct guidance: enumerate via getProgramAccounts for current data, start workflow A for ongoing changes, treat the subscription start as the history-begins-here marker. External-snapshot bootstrap or program-side replay flagged as out-of-scope for this guide. * Replaced the TypeScript example with a concrete lamport-balance backfill that locates the target account's index in transaction.message.accountKeys and persists meta.postBalances[i]. Added a note about extending to token balances and log events. * Updated the workflow comparison table: 'Reconstruction effort' row replaced with 'Data captured' to clearly distinguish what each workflow returns (full account bytes vs balances + log events only). * Updated the closing 'combine the two' paragraph to reflect the correct combined workflow. Refs DOCS-84 Requested-by: @SahilAujla --- .../solana/historical-account-state.mdx | 103 +++++++++++++----- 1 file changed, 78 insertions(+), 25 deletions(-) diff --git a/content/api-reference/solana/historical-account-state.mdx b/content/api-reference/solana/historical-account-state.mdx index 930c50306..0b5cc3f85 100644 --- a/content/api-reference/solana/historical-account-state.mdx +++ b/content/api-reference/solana/historical-account-state.mdx @@ -116,21 +116,49 @@ Yellowstone supports replaying historical updates by setting `from_slot` on the ## Workflow B: historical backfill via transaction replay -Snapshotting forward only captures state from the moment your subscription starts. 
To answer questions about earlier periods, replay the transactions that touched the account.
+Snapshotting forward only captures state from the moment your subscription starts. For periods before that, the available primitive is the transaction history exposed through `getTransaction`. This workflow is best understood by what each `getTransaction` response actually contains and what it does not.
+
+### What `getTransaction` provides
+
+For each transaction, `getTransaction` returns:
+
+* `meta.preBalances` and `meta.postBalances`: SOL lamport balances per account (indexed against `transaction.message.accountKeys`).
+* `meta.preTokenBalances` and `meta.postTokenBalances`: SPL token balances per account.
+* `meta.innerInstructions`: CPIs invoked by the top-level instructions.
+* `meta.logMessages`: program log output.
+* `transaction.message.instructions` and `transaction.message.accountKeys`: the instructions and the accounts they touched.
+
+What `getTransaction` does **not** return: pre or post account `data` blobs. There is no `preAccountData` / `postAccountData` field. That is the central constraint on this workflow.
+
+<Warning>
+  Transaction replay can reliably reconstruct three things: **SOL lamport balances** (from pre/post balances), **SPL token balances** (from pre/post token balances), and **events emitted via program logs** (Anchor `emit!` macros, custom `msg!` lines).
+
+  It **cannot** reconstruct arbitrary program account `data` blobs from `getTransaction` alone. To rebuild full account `data` for historical periods, you need either (a) a starting `data` snapshot from before the period of interest, plus program-specific decoding of each touching instruction to derive the new state, or (b) the program emitting comprehensive event logs that cover every state mutation. If neither applies, the supported path is to start workflow A as early as possible and accept that history before the subscription start is not recoverable from transactions alone.
+</Warning>

### Pattern

1. 
Call [`getSignaturesForAddress`](/docs/chains/solana/solana-api-endpoints/get-signatures-for-address) for the target account, paging through history with `before` and `until` cursors. Returns signatures plus block times. -2. For each signature, call [`getTransaction`](/docs/chains/solana/solana-api-endpoints/get-transaction) to fetch the full transaction, pre/post balances, token balances, and inner instructions. -3. Apply each transaction's effect on the account to your local replica, in slot order. Stop and write a snapshot whenever the next transaction crosses your target sampling boundary (the next hour, the next day, or any other cadence). +2. For each signature, call [`getTransaction`](/docs/chains/solana/solana-api-endpoints/get-transaction) and locate the target account's index in `transaction.message.accountKeys`. +3. Read `meta.postBalances[i]` (and `meta.postTokenBalances` for token accounts) for the state immediately after this transaction's writes. Persist a row keyed by `(pubkey, slot, signature)` so you can later answer "balance at hour H" as the row with the largest slot less than or equal to `block_at(H)`. + +`meta.preBalances` and `meta.postBalances` are consistent across adjacent transactions: the `preBalances[i]` of the transaction at slot T+1 equals the `postBalances[i]` of the most recent earlier transaction that touched account `i`. Use the post-side of each transaction as the canonical value for its slot. Alchemy's Solana archival surface is optimized for this kind of historical scan. Per the [Built for Solana](https://www.alchemy.com/blog/solana-infrastructure) release, `getTransaction` is up to 20x faster than other providers on historical calls, and `getSignaturesForAddress` supports recency-first ordering so you can walk from the present backward without scanning from genesis. 
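The step 3 lookup ("balance at hour H" is the row with the largest slot at or before `block_at(H)`) can be sketched as a pure function over the persisted rows; the `StoredBalance` shape is illustrative, not a fixed schema:

```typescript
// Sketch: resolve "lamport balance as of slot S" from rows persisted by the
// backfill — one row per touching transaction, holding the post-side balance.
type StoredBalance = { slot: number; signature: string; lamports: number };

// Rows need not be pre-sorted; scan for the largest slot <= targetSlot.
function balanceAtSlot(
  rows: StoredBalance[],
  targetSlot: number
): number | undefined {
  let best: StoredBalance | undefined;
  for (const row of rows) {
    if (row.slot <= targetSlot && (best === undefined || row.slot > best.slot)) {
      best = row; // keep the latest touching transaction at or before the target
    }
  }
  return best?.lamports;
}
```

A target slot earlier than the first touching transaction returns `undefined`, signalling that the backfill has not reached that far back yet.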
-### When you need program-account state, not just a single account +### When you need program-account state, not just balances + +For lending position values, vault share prices, oracle values, market state, and any other field that lives inside a program account's `data` blob, workflow B is not sufficient on its own (see the warning above). The supported path is: -If you are snapshotting state across an entire program (for example, every position account under a lending program), do the initial enumeration with paginated [`getProgramAccounts`](/docs/chains/solana/solana-api-endpoints/get-program-accounts) calls. See [AccountsDB Infrastructure](/docs/solana/accounts-db-infra) for the `pageKey` + `order` pattern. Once you have the address set, you can either backfill each address with workflow B, or subscribe to the program owner via workflow A to capture future changes. +1. Enumerate the account set with paginated [`getProgramAccounts`](/docs/chains/solana/solana-api-endpoints/get-program-accounts) calls using Alchemy's [AccountsDB Infrastructure](/docs/solana/accounts-db-infra) (`pageKey` + `order`). This gives you every account and its **current** `data`. +2. Start a Yellowstone gRPC subscription (workflow A) filtered by program owner. From that point forward, every state change is captured with the full new `data` bytes. +3. Treat the moment the subscription starts as your "history begins here" marker. Reads at any slot at or after that marker resolve against the snapshot table; reads before it are not supported by this workflow. -### Minimal TypeScript example +The earlier you start workflow A, the smaller the unsupported window. If a backfill before the subscription start is required, the practical options are to bootstrap from an external historical-snapshot source for the program or to model the program's instructions program-side and replay them against a starting snapshot. Both are out of scope for this guide. 
-### Minimal TypeScript example
+### Minimal TypeScript example: lamport balance backfill
+
+This example walks one account's signature history and persists post-balance per transaction, which is what workflow B reliably supports.

```typescript
import { Connection, PublicKey } from "@solana/web3.js";
@@ -140,11 +168,19 @@ const connection = new Connection(
  "confirmed"
);

-async function walkAccountHistory(
+type BalanceRow = {
+  pubkey: string;
+  slot: number;
+  signature: string;
+  lamports: number;
+};
+
+async function backfillLamportHistory(
   address: string,
   untilSlot: number
-): Promise<void> {
+): Promise<BalanceRow[]> {
   const pubkey = new PublicKey(address);
+  const rows: BalanceRow[] = [];
   let before: string | undefined = undefined;

   while (true) {
@@ -155,24 +191,41 @@ async function walkAccountHistory(
     if (sigs.length === 0) break;

     for (const sig of sigs) {
-      if (sig.slot < untilSlot) return;
+      if (sig.slot < untilSlot) return rows;

       const tx = await connection.getTransaction(sig.signature, {
         maxSupportedTransactionVersion: 0,
       });
-      if (!tx) continue;
+      if (!tx || !tx.meta) continue;

-      // Apply the transaction's effect on the target account here.
-      // For lamports-only changes, use tx.meta.preBalances/postBalances.
-      // For program-state changes, decode the account data using your IDL.
-      console.log(`slot=${sig.slot} signature=${sig.signature}`);
+      // Locate the target account's index in the message account keys.
+      const keys = tx.transaction.message.getAccountKeys({
+        accountKeysFromLookups: tx.meta.loadedAddresses,
+      });
+      const index = keys
+        .keySegments()
+        .flat()
+        .findIndex((k) => k.equals(pubkey));
+      if (index === -1) continue;
+
+      // Post-balance is the canonical lamport value after this transaction's writes. 
+ rows.push({ + pubkey: address, + slot: sig.slot, + signature: sig.signature, + lamports: tx.meta.postBalances[index], + }); } before = sigs[sigs.length - 1].signature; } + + return rows; } ``` +For SPL token balances, swap `meta.postBalances[index]` for the matching entry in `meta.postTokenBalances` (which is indexed by `accountIndex` and includes `mint`, `owner`, and `uiTokenAmount`). For program log events, parse `meta.logMessages` against your program's known event format. + ## Choosing a sampling cadence The two workflows above give you per-update granularity. Picking a coarser cadence (hourly, daily) is a query-time concern, not an ingest-time one. Two common approaches: @@ -184,17 +237,17 @@ To map wall-clock timestamps to slots for the boundaries, call [`getBlockTime`]( ## Workflow comparison -| Aspect | Yellowstone gRPC (Workflow A) | Transaction replay (Workflow B) | -| ---------------------- | ---------------------------------------------------- | -------------------------------------------------------------- | -| Time horizon | From subscription start, forward | From any point in history, backward | -| Latency to new data | Real-time (sub-second) | As fast as you can page archival history | -| Filter granularity | Address, owner, memcmp, data size, lamports | Per-address (via `getSignaturesForAddress`) | -| Reconstruction effort | None — receive full account bytes per update | Decode each transaction's effect on the account using your IDL | -| Replay after gap | `from_slot`, up to 6000 slots (~40 min) | Unbounded, limited only by archival depth | -| Best for | Ongoing capture of a known account or program set | Backfilling history before your stream started | -| Plan requirement | PAYG or Enterprise | Available on all paid plans | - -Combine the two: turn on the gRPC subscription first, persist its slot as your "snapshot start", then run the backfill from your business epoch up to that slot. 
Once the backfill completes, your store has continuous coverage. +| Aspect | Yellowstone gRPC (Workflow A) | Transaction replay (Workflow B) | +| --------------------- | ------------------------------------------------- | ------------------------------------------------------------------------------------- | +| Time horizon | From subscription start, forward | From any point in history, backward | +| Latency to new data | Real-time (sub-second) | As fast as you can page archival history | +| Filter granularity | Address, owner, memcmp, data size, lamports | Per-address (via `getSignaturesForAddress`) | +| Data captured | Full account bytes (data, lamports, owner) | SOL balance, SPL token balance, program log events only (no account `data` blobs) | +| Replay after gap | `from_slot`, up to 6000 slots (~40 min) | Unbounded, limited only by archival depth | +| Best for | Ongoing capture of a known account or program set | Backfilling SOL/token balance history; not sufficient for arbitrary program state | +| Plan requirement | PAYG or Enterprise | Available on all paid plans | + +Combine the two: turn on the gRPC subscription first, persist its slot as your "snapshot start", then run workflow B in parallel to backfill SOL and SPL token balance history before that slot. For full account `data` history before the subscription start, see the warning under [Workflow B](#workflow-b-historical-backfill-via-transaction-replay): start workflow A as early as possible, or source a starting snapshot from an external historical-data product. ## Related references