HDDS-9154. Design doc for snapshot diff optimization. #10335

Open
SaketaChalamchala wants to merge 3 commits into apache:master from SaketaChalamchala:HDDS-9154

@SaketaChalamchala
Contributor

What changes were proposed in this pull request?

This is a design doc for an optimized snapshot diff. It relies mostly on sequential reads of the snapshot DBs and computes the diff directly from the delta files, whereas the current implementation performs many cross-database random lookups.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-9154

How was this patch tested?

N/A

@SaketaChalamchala SaketaChalamchala added snapshot https://issues.apache.org/jira/browse/HDDS-6517 design labels May 22, 2026
@SaketaChalamchala SaketaChalamchala marked this pull request as ready for review May 26, 2026 22:23
Comment on lines +92 to +94
### 2.3. UpdateID Gating
**Baseline Issue:** The baseline performs full object comparisons including timestamps to detect modifications, which is susceptible to clock skew and is computationally expensive.
**Optimized Design:** Uses the `dbTxSequenceNumber` of the `fromSnapshot` as a strict gate. During the `toSnapshot` scan, entries are only considered candidates for diff if their `updateID > fromSnapshotDbTxSequenceNumber`.
Contributor

@smengcl smengcl May 27, 2026

`updateID` comes from the Ratis `trxnLogIndex`, whereas `dbTxSequenceNumber` is a RocksDB concept. Please double-check whether the two are comparable.

Contributor Author

@SaketaChalamchala SaketaChalamchala May 27, 2026

Thanks for pointing this out, @smengcl. `updateID` and `dbTxSequenceNumber` are not comparable. `dbTxSequenceNumber` most likely advances faster than `updateID`, so the comparison `updateID > fromSnapshotDbTxSequenceNumber` may miss some valid diff entries.
A more suitable gate would be `updateID` > the `fromSnapshot`'s `lastTransactionInfo.txnIndex`. This may only need to be an optimization on the full diff path.
For the DAG-based diff, we can read each entry's sequence number using the raw SST iterators and compare it with `fromSnapshotDbTxSequenceNumber`. This guarantees correctness regardless of any `updateID` bugs.
What do you think?
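
To make the mismatch concrete, here is a minimal sketch of the three comparisons discussed above. It is not Ozone code: the `Entry` type, the helper functions, and all numbers are made up for illustration, and it assumes (as suggested above, not verified) that the RocksDB sequence number runs ahead of the Ratis index.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a key entry seen while scanning toSnapshot."""
    key: str
    update_id: int    # Ratis transaction log index stamped on the entry
    rocksdb_seq: int  # RocksDB sequence number of the write that stored it


def gate_by_update_id(entries: Iterable[Entry], threshold: int) -> List[str]:
    """Keep keys whose updateID is strictly greater than the threshold."""
    return [e.key for e in entries if e.update_id > threshold]


def gate_by_rocksdb_seq(entries: Iterable[Entry], from_db_seq: int) -> List[str]:
    """Keep keys whose RocksDB sequence number is past the fromSnapshot's."""
    return [e.key for e in entries if e.rocksdb_seq > from_db_seq]


# Illustrative values only: fromSnapshot was taken at Ratis index 100,
# by which point the RocksDB sequence number had already reached 250.
FROM_TXN_INDEX = 100  # plays the role of lastTransactionInfo.txnIndex
FROM_DB_SEQ = 250     # plays the role of fromSnapshotDbTxSequenceNumber

TO_SNAPSHOT_ENTRIES = [
    Entry("vol/bucket/a", update_id=90, rocksdb_seq=200),   # unchanged
    Entry("vol/bucket/b", update_id=120, rocksdb_seq=300),  # modified later
    Entry("vol/bucket/c", update_id=260, rocksdb_seq=520),  # modified later
]

# Mixing counters: updateID compared against a RocksDB sequence number.
mixed = gate_by_update_id(TO_SNAPSHOT_ENTRIES, FROM_DB_SEQ)
# Like-for-like gates: Ratis vs. Ratis, RocksDB vs. RocksDB.
by_txn_index = gate_by_update_id(TO_SNAPSHOT_ENTRIES, FROM_TXN_INDEX)
by_db_seq = gate_by_rocksdb_seq(TO_SNAPSHOT_ENTRIES, FROM_DB_SEQ)

print("mixed gate:      ", mixed)
print("txnIndex gate:   ", by_txn_index)
print("sequence gate:   ", by_db_seq)
```

With these numbers the mixed gate drops `vol/bucket/b`, which was modified after the `fromSnapshot`, while both like-for-like gates keep it.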

@SaketaChalamchala SaketaChalamchala requested a review from smengcl May 27, 2026 23:28
Labels

design snapshot https://issues.apache.org/jira/browse/HDDS-6517

2 participants