GH-508: Finalize Variant and shredding specs #509
Merged
wgtmac merged 1 commit into apache:master on Aug 25, 2025
Fokko
approved these changes
Aug 20, 2025
alamb
approved these changes
Aug 20, 2025
Contributor
alamb
left a comment
From the Rust perspective (https://github.com/apache/arrow-rs) this works well
Thank you
wgtmac
approved these changes
Aug 21, 2025
Member
wgtmac
left a comment
The progress looks good. We can merge this once the vote passes.
Contributor
Author
Member
Thanks @aihuaxu for driving this!
jiayuasu pushed a commit to jiayuasu/parquet-format that referenced this pull request on Oct 6, 2025
Does Variant support top-array-struct?
Contributor
Author
@amorynan You mean nested Variant types such as ARRAY(variant) and MAP<String, Variant>, right? I think that still needs some work.
Rationale for this change
Per Parquet's requirements, we need at least two reference implementations to finalize the Variant logical type specification. Here is the current status:
Java already has the encoding and shredding implementations in place:
apache/parquet-java#3197
apache/parquet-java#3202
apache/parquet-java#3223
apache/parquet-java#3211
Go also includes encoding and shredding support:
apache/arrow-go#344
apache/arrow-go#434
We have also verified cross-language compatibility between the Java and Go implementations (apache/arrow-go#455 and apache/parquet-java#3258).
Rust is currently working on the shredding implementation. In addition to these, we already have a full Variant implementation in Apache Iceberg, as well as in some closed-source engines. At this point, I think we have enough implementation coverage to move forward with finalizing the Variant spec.
What changes are included in this PR?
Remove "under active development" notes from the doc.
Do these changes have PoC implementations?
Closes #508