Add V3 manifest/manifest-list writing and row-lineage snapshot commits#3070

Open
SaymV wants to merge 2 commits into apache:main from SaymV:codex/v3-manifest-writing

SaymV commented Feb 20, 2026

Summary

  • enable writing table metadata for format v3 and raise the supported format ceiling to v3
  • add ManifestWriterV3 and ManifestListWriterV3, including v3 manifest-list first_row_id assignment
  • wire snapshot commits to populate v3 first_row_id and added_rows from manifest-list row-id assignment
  • allow transaction format upgrades to v3 and initialize next_row_id=0 for new/upgraded v3 tables
  • update unit and integration tests to assert v3 write success (including nanosecond schema and row-lineage assertions)
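
As a rough illustration of the upgrade rule in the summary (allow upgrades up to v3, and start `next_row_id` at 0 for tables entering v3), here is a minimal sketch over a plain metadata dict. The helper `upgrade_format_version` and the constant `SUPPORTED_FORMAT_CEILING` are hypothetical names for this example, not the functions changed in this PR; only the `format-version` and `next-row-id` field names come from the table metadata format. The downgrade check and the choice to leave an existing counter untouched are assumptions.

```python
from typing import Any, Dict

# Assumption: stands in for the "supported format ceiling" mentioned above.
SUPPORTED_FORMAT_CEILING = 3


def upgrade_format_version(metadata: Dict[str, Any], target: int) -> Dict[str, Any]:
    """Return a copy of `metadata` upgraded to `target` (hypothetical helper)."""
    current = metadata["format-version"]
    if target > SUPPORTED_FORMAT_CEILING:
        raise ValueError(f"Unsupported format version: {target}")
    if target < current:
        raise ValueError(f"Cannot downgrade v{current} table to v{target}")

    upgraded = dict(metadata)
    upgraded["format-version"] = target
    # Row lineage begins at v3: seed the counter when first entering v3,
    # and never reset a counter that is already present.
    if target >= 3 and current < 3:
        upgraded.setdefault("next-row-id", 0)
    return upgraded


v2_table = {"format-version": 2}
v3_table = upgrade_format_version(v2_table, 3)
```

The input dict is copied rather than mutated, so a failed or abandoned transaction would leave the original metadata unchanged.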

Implementation notes

  • DEFAULT_READ_VERSION in manifest handling is now 3 so v3 fields (including ManifestFile.first_row_id) are preserved when reading/re-writing manifests.
  • v3 manifest-list row-id assignment follows Java behavior: assign only for data manifests missing first_row_id, and advance by existing_rows_count + added_rows_count.
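
The row-id assignment rule in the second note can be sketched as follows. This is a simplified model, not the `ManifestListWriterV3` code from this PR: `ManifestSketch` and `assign_first_row_ids` are hypothetical names, and manifests are reduced to the few fields the rule needs. Deriving the snapshot's `added_rows` as the difference between the counter before and after assignment is an assumption based on the summary's description of snapshot commits.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ManifestSketch:
    """Hypothetical stand-in for a manifest-list entry."""
    content: str  # "data" or "deletes"
    added_rows_count: int
    existing_rows_count: int
    first_row_id: Optional[int] = None


def assign_first_row_ids(
    manifests: List[ManifestSketch], next_row_id: int
) -> Tuple[List[ManifestSketch], int]:
    """Assign first_row_id to data manifests that lack one.

    Returns the updated entries and the advanced next_row_id.
    """
    assigned = []
    for m in manifests:
        if m.content == "data" and m.first_row_id is None:
            assigned.append(replace(m, first_row_id=next_row_id))
            # Advance past every row this manifest may carry.
            next_row_id += m.existing_rows_count + m.added_rows_count
        else:
            # Delete manifests and already-assigned data manifests pass through.
            assigned.append(m)
    return assigned, next_row_id


start = 100  # the table's next_row_id when the commit begins
entries = [
    ManifestSketch("data", added_rows_count=10, existing_rows_count=5),
    ManifestSketch("deletes", added_rows_count=0, existing_rows_count=0),
    ManifestSketch("data", added_rows_count=7, existing_rows_count=0, first_row_id=40),
    ManifestSketch("data", added_rows_count=3, existing_rows_count=0),
]
assigned, end = assign_first_row_ids(entries, start)

# Assumed snapshot-level values derived from the assignment.
snapshot_first_row_id = start
snapshot_added_rows = end - start
```

In this example the two unassigned data manifests receive `first_row_id` values 100 and 115, the delete manifest stays unassigned, the pre-assigned manifest keeps 40, and the snapshot records 18 added rows.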

Follow-up scope (tracked by #1551)

  • This PR intentionally targets the v3 core writer path for append/delete/overwrite commits.
  • Remaining v3-related scope appears broader, including equality-delete execution support, deletion-vector execution semantics, and wider catalog/runtime parity coverage (for example, Hive paths).
  • If maintainers prefer, I can add part of this missing scope to this PR; otherwise I can open a focused follow-up PR linked to #1551.

Validation

  • ./.venv/bin/python -m pytest -q tests/table/test_metadata.py tests/table/test_init.py tests/utils/test_manifest.py
  • uv run python -m pytest -q tests/integration/test_reads.py -k upgrade_table_version
  • uv run python -m pytest -q tests/integration/test_writes/test_writes.py -k "test_nanosecond_support_on_catalog or test_v3_write_and_read_row_lineage"

Related: #1551

SaymV marked this pull request as draft February 20, 2026 00:26

SaymV commented Feb 20, 2026

Linking this PR to the long-running V3 writer tracker: #1551 ("Support writing V3 tables").

This change is the core manifest/manifest-list + metadata/snapshot writer path that unblocks practical V3 table writes (including geospatial-driven V3 usage), while still deferring broader full V3 parity work.

SaymV marked this pull request as ready for review February 20, 2026 01:01