feat(transaction): RowDelta merge-on-read deletion vectors (carries forward #2203)#2678
Open
raghav-reglobe wants to merge 15 commits into
Open
feat(transaction): RowDelta merge-on-read deletion vectors (carries forward #2203)#2678raghav-reglobe wants to merge 15 commits into
raghav-reglobe wants to merge 15 commits into
Conversation
5486140 to
c9f97a6
Compare
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
The caching delete-file loader now recognizes a deletion vector (PositionDeletes task whose file_format is Puffin), reads the deletion-vector-v1 blob via PuffinReader, and parses it with DeleteVector::from_puffin_blob into the delete filter — completing the V3 DV read path (the DelVecs -> upsert_delete_vector output side was already present). With this, iceberg-rust correctly excludes rows marked by a V3 deletion vector on read; previously such rows were returned. Read complement to the DV-write in apache#2678. Part 3 of DV-applied scan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…fications This commit implements the core transaction infrastructure for MERGE INTO, UPDATE, and DELETE operations in Apache Iceberg-Rust. Based on the official Iceberg Java implementation (RowDelta API). **New file: `crates/iceberg/src/transaction/row_delta.rs`** - RowDeltaAction: Transaction action supporting both data file additions and deletions in a single snapshot - add_data_files(): Add new data files (inserts/rewrites in COW mode) - remove_data_files(): Mark data files as deleted (COW mode) - add_delete_files(): Reserved for future Merge-on-Read (MOR) support - validate_from_snapshot(): Conflict detection for concurrent modifications - RowDeltaOperation: Implements SnapshotProduceOperation trait - Determines operation type (Append/Delete/Overwrite) based on changes - Generates DELETED manifest entries for removed files - Carries forward existing manifests for unchanged data **Modified: `crates/iceberg/src/transaction/mod.rs`** - Add row_delta() method to Transaction API - Export row_delta module **Modified: `crates/iceberg/src/transaction/snapshot.rs`** - Add write_delete_manifest() to write DELETED manifest entries - Update manifest_file() to process delete entries from SnapshotProduceOperation - Update validation to allow delete-only operations Comprehensive unit tests with ~85% coverage: - test_row_delta_add_only: Pure append operation - test_row_delta_remove_only: Delete-only operation - test_row_delta_add_and_remove: COW update (remove old, add new) - test_row_delta_with_snapshot_properties: Custom snapshot properties - test_row_delta_validate_from_snapshot: Snapshot validation logic - test_row_delta_empty_action: Empty operation error handling - test_row_delta_incompatible_partition_value: Partition validation All existing tests pass (1135 passed; 0 failed). Copy-on-Write (COW) Strategy: - For row-level modifications: read target files, apply changes, write new files, mark old files deleted - For inserts: write new data files - Merge-on-Read (MOR) with delete files is reserved for future optimization References: - Java implementation: org.apache.iceberg.RowDelta, BaseRowDelta - Based on implementation plan for MERGE INTO support
- Fix test_row_delta_validate_from_snapshot to assert ErrorKind::DataInvalid directly rather than matching against error message strings - Correct operation() doc comment: remove inaccurate "Only adds delete files → Delete" bullet; add explicit note that Operation::Delete is deferred until MoR is wired up - Add comment explaining why removed_data_files are not validated (already-committed files, matches Java MergingSnapshotProducer behavior) Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
unwrap_err() requires T: Debug on the Ok type (ActionCommit), which is not derived. Use a match instead to extract and assert the error kind. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…eview Blocking issues fixed: - existing_manifest() now rewrites manifests that contain deleted files instead of dropping them entirely. Surviving files get EXISTING entries (original sequence numbers preserved), removed files get DELETED entries (snapshot_id updated to current, sequence numbers preserved). Matches Java ManifestFilterManager.filterManifestWithDeletedFiles behavior. - DELETED manifest entries now carry original sequence numbers (copied from the loaded manifest entry), fixing the spec violation where 0 was used as a placeholder. - Snapshot summary now tracks removed files via SnapshotSummaryCollector.remove_file(), populating deleted-data-files, deleted-records, and removed-files-size metrics. - add_delete_files() now returns ErrorKind::FeatureUnsupported immediately on commit instead of silently dropping the files. Design improvements: - Renamed write_delete_manifest → write_manifest_with_deleted_entries to distinguish data manifests with DELETED-status entries from Iceberg delete manifests (content=Deletes, used for MoR delete files). - SnapshotProduceOperation::existing_manifest now takes &mut SnapshotProducer so implementations can call new_manifest_writer() for rewrites. - Added removed_data_files() default method to the trait for summary tracking. - Removed added_delete_files from RowDeltaOperation (only needed for the fail-fast check in RowDeltaAction::commit). - Trimmed struct/method doc comments to match codebase convention. Tests: - Added test_row_delta_cow_manifest_rewrite: FastAppend 2 files, RowDelta remove one + add one, then verify DELETED/EXISTING/ADDED entries and sequence numbers in the resulting manifests. - Added test_row_delta_add_delete_files_errors for the fail-fast path. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Implements DeleteVector::to_puffin_blob / from_puffin_blob for the Iceberg deletion-vector-v1 Puffin blob format: 4-byte BE length, magic 0xD1D33964, portable 64-bit RoaringTreemap, 4-byte BE CRC32. Adds the crc32fast dependency. Tests: self serialize<->deserialize round-trip; pyiceberg-portable layout validation (Spark/Iceberg-Java byte compatibility); full Puffin-file round-trip via PuffinWriter/PuffinReader. Serializer design referenced from risingwavelabs/iceberg-rust apache#113. Toward finishing RowDelta merge-on-read DV-write (PR apache#2203). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…es manifest) SnapshotProducer now carries added_delete_files and writes a separate content=Deletes manifest (build_v2_deletes / build_v3_deletes). RowDeltaAction.add_delete_files commits MoR delete files (position/equality, incl. V3 deletion vectors) instead of returning FeatureUnsupported; operation() returns Delete (only deletes) or Overwrite (with data files) per Java BaseRowDelta semantics. Replaces the stub-error test with test_row_delta_add_delete_files_mor (commits a position-delete file; asserts Operation::Delete + a PositionDeletes manifest entry). All 45 transaction tests pass (fast_append / overwrite / CoW unregressed). Toward finishing RowDelta merge-on-read DV-write (PR apache#2203). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
Adds PuffinWriter::close_with_metadata() + PuffinWriteResult (file size + per-blob offsets/lengths). DeleteVector::write_to_puffin_file() writes a deletion-vector-v1 Puffin file and returns the V3 DataFile{content=PositionDeletes, referenced_data_file, content_offset, content_size_in_bytes} ready to feed RowDeltaAction::add_delete_files.
Test test_dv_write_to_puffin_file (DV -> Puffin file -> DataFile, read back). Completes the end-to-end Rust MoR DV-write path: DeleteVector -> Puffin file -> DataFile -> RowDelta content=Deletes manifest commit. cf. RW deletion_vector_writer.rs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…ation example Exposes 'pub mod delete_vector' (+ rustdoc on the public surface to satisfy #![deny(missing_docs)]) so downstream crates can use DeleteVector + write_to_puffin_file. Adds crates/catalog/rest/examples/pulse_dv_realdata.rs: connects to a REST catalog (Polaris), loads a real V3 table, writes a deletion vector via write_to_puffin_file, commits via RowDelta (content=Deletes manifest) — for cross-engine (Doris/Spark) validation before opening PR apache#2203. Compiles clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…mple RestCatalog requires explicit StorageFactory injection (S3 storage moved to the iceberg-storage-opendal crate, apache#2207). Wire OpenDalResolvingStorageFactory into the pulse_dv_realdata example and dev-dep the crate. Verified end-to-end against a real V3 Iceberg table on S3 (Polaris REST catalog): the Rust harness wrote a deletion-vector-v1 and committed via RowDelta; Doris (independent engine) read COUNT(*) 10 -> 7 with positions 0,1,2 (ids 1,2,3) deleted. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…ayload
Adds testdata/puffin/deletion-vector-v1-payload.bin (Java-produced deletion-vector-v1 payload for positions {1,3,5,7,9}, from apache/iceberg test resources via apache/iceberg-go's fixture) and a test asserting DeleteVector::to_puffin_blob produces byte-identical output. Proves our roaring serialization + length/magic/CRC framing match the Iceberg-Java reference exactly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…rg-go) Mirrors apache/iceberg-go TestSerializeDVEmpty + TestSerializeDVLargePositions: empty DV round-trips; positions straddling the 2^31 (Java-signed) and 2^32 (roaring bucket) boundaries round-trip. 12 delete_vector tests now pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
…mall+large) From apache/iceberg core test resources: empty-position-index.bin + small-and-large-values-position-index.bin. Now 3 Java byte-identical DV-payload vectors (empty; alternating 1/3/5/7/9; small+large across two 16-bit roaring containers). 14 delete_vector tests pass. all-container-types fixture intentionally NOT byte-pinned: Java run-optimizes containers, roaring-rs does not — spec-valid but different bytes (round-trip still covered). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
Apache iceberg-rust CI gates on rustfmt + clippy; both now clean on the changed crates (iceberg, iceberg-catalog-rest). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
…eleteFile Groundwork for V3 deletion-vector-applied scan: the delete-file scan task now carries the manifest entry's file format (Puffin marks a DV) and referenced_data_file, so the delete-file loader can distinguish a Puffin DV from a Parquet positional/equality delete and know which data file it applies to. Populated in From<&DeleteFileContext>; test constructors default to Parquet/None. Part 1 of DV-applied scan (complements the DV-write in apache#2678). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
The caching delete-file loader now recognizes a deletion vector (PositionDeletes task whose file_format is Puffin), reads the deletion-vector-v1 blob via PuffinReader, and parses it with DeleteVector::from_puffin_blob into the delete filter — completing the V3 DV read path (the DelVecs -> upsert_delete_vector output side was already present). With this, iceberg-rust correctly excludes rows marked by a V3 deletion vector on read; previously such rows were returned. Read complement to the DV-write in apache#2678. Part 3 of DV-applied scan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
c9f97a6 to
a8bc543
Compare
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
…eleteFile Groundwork for V3 deletion-vector-applied scan: the delete-file scan task now carries the manifest entry's file format (Puffin marks a DV) and referenced_data_file, so the delete-file loader can distinguish a Puffin DV from a Parquet positional/equality delete and know which data file it applies to. Populated in From<&DeleteFileContext>; test constructors default to Parquet/None. Part 1 of DV-applied scan (complements the DV-write in apache#2678). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
The caching delete-file loader now recognizes a deletion vector (PositionDeletes task whose file_format is Puffin), reads the deletion-vector-v1 blob via PuffinReader, and parses it with DeleteVector::from_puffin_blob into the delete filter — completing the V3 DV read path (the DelVecs -> upsert_delete_vector output side was already present). With this, iceberg-rust correctly excludes rows marked by a V3 deletion vector on read; previously such rows were returned. Read complement to the DV-write in apache#2678. Part 3 of DV-applied scan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
- adapt pulse_dv_realdata example to manifest_list_reader (load_manifest_list was removed from Snapshot in current main) - allowlist 'mor' (merge-on-read) in .typos.toml - regenerate crates/iceberg/public-api.txt and DEPENDENCIES.rust.tsv files Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
0ff7ccd to
207aca3
Compare
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
…eleteFile Groundwork for V3 deletion-vector-applied scan: the delete-file scan task now carries the manifest entry's file format (Puffin marks a DV) and referenced_data_file, so the delete-file loader can distinguish a Puffin DV from a Parquet positional/equality delete and know which data file it applies to. Populated in From<&DeleteFileContext>; test constructors default to Parquet/None. Part 1 of DV-applied scan (complements the DV-write in apache#2678). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
raghav-reglobe
added a commit
to raghav-reglobe/iceberg-rust
that referenced
this pull request
Jun 20, 2026
The caching delete-file loader now recognizes a deletion vector (PositionDeletes task whose file_format is Puffin), reads the deletion-vector-v1 blob via PuffinReader, and parses it with DeleteVector::from_puffin_blob into the delete filter — completing the V3 DV read path (the DelVecs -> upsert_delete_vector output side was already present). With this, iceberg-rust correctly excludes rows marked by a V3 deletion vector on read; previously such rows were returned. Read complement to the DV-write in apache#2678. Part 3 of DV-applied scan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Raghvendra Singh <raghav@cashify.in>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Completes
RowDeltawith the merge-on-read (MoR) path — V3 Puffin deletion vectors committed via acontent=Deletesmanifest — on top of @wirybeaver's copy-on-writeRowDeltafoundation from #2203 (rebased onto currentmain, authorship preserved). Together they giveRowDeltaboth COW (rewrite data files) and MoR (deletion vectors) — the basis forMERGE INTO/UPDATE/DELETEon V3 tables.#2203 has been stale and conflicting with
main; per my note there, this carries it forward. The rebase conflicts were trivial (additivemod/usekeep-both) plus one moved API (Snapshot::load_manifest_list→Table::manifest_list_reader).MoR additions (on top of #2203)
DeleteVector::{to_puffin_blob, from_puffin_blob, write_to_puffin_file}— serialize/deserialize thedeletion-vector-v1blob (magic0xD1D33964, 4-byte BE length + portable 64-bit roaring bitmap + 4-byte BE CRC32;snapshot-id/sequence-number=-1;fields=[]) and produce thePositionDeletesDataFile.RowDeltaAction::add_delete_filesMoR commit — writes acontent=Deletesmanifest withOperation::Delete(replaces the previousFeatureUnsupportedstub).delete_vectormadepub(with rustdoc).Validation
to_puffin_bloboutput equals the Java-produced golden DV fixtures fromapache/icebergcore test resources, for 3 vectors: empty,{1, 3, 5, 7, 9}, and small+large spanning two roaring containers (crates/iceberg/testdata/puffin/*-position-index.bin).apache/iceberg-go(puffin/dv_golden_test.go,TestSerializeDV*).delete_vector+ 7transaction::row_deltatests pass;cargo fmt+cargo clippyclean.Credit & follow-ups
RowDeltacopy-on-write foundation by @wirybeaver (feat(transaction): add RowDelta transaction action for row-level modi… #2203) — carried forward here with authorship preserved.DVWriteraccumulator (multi-blob-per-Puffin, dedup, partition capture) as in iceberg-go; reader-side DV invariant enforcement (snapshot-id/sequence-number= -1 on read).