MINOR: Preserve empty list offsets during split transfers #31
Draft
telemenar wants to merge 7 commits into
This reverts commit e293989.
splitAndTransfer should always return a valid allocated vector. A vector with no entries still needs, by spec, a value of 0 in its offsetBuffer. A list vector with a zero-capacity offsetBuffer is therefore not valid. This moves the empty-offset repair closer to where the invalid state is introduced by ensuring zero-length ListVector and LargeListVector split transfers materialize the required offset entry. Nested zero-length list transfers get the same treatment through the child transfer pair. One grey area remains: getFieldBuffers() triggering allocation for an otherwise unallocated vector is useful as a last-line guard for serialization/export, but it is not the cleanest owner of the allocation invariant.
4c70b45 to 0ab8ab1
What changed
This is stacked on top of #30 and extends the same empty-offset-buffer handling into zero-length splitAndTransfer() paths for ListVector and LargeListVector.
For zero-length splits, the target vector now materializes the required empty offset entry and recursively invokes the child transfer pair with a zero-length range so nested list vectors get the same invariant treatment.
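The zero-length behavior described above can be sketched with plain int arrays standing in for offset buffers. This is a toy model, not the code in this PR and not Arrow's API: the class OffsetSplitSketch and the methods splitOffsets and splitNested are hypothetical names, and each nesting level is assumed to be represented only by its offsets.

```java
import java.util.Arrays;

/** Toy model of list offsets. Hypothetical names; this is not Arrow's ListVector. */
final class OffsetSplitSketch {

    /**
     * Returns rebased offsets for `length` lists starting at list index `start`.
     * The result always has length + 1 entries, so a zero-length split
     * yields {0} rather than an empty array.
     */
    static int[] splitOffsets(int[] sourceOffsets, int start, int length) {
        int[] target = new int[length + 1]; // target[0] is 0 by default
        if (length == 0) {
            return target; // never touches the (possibly empty) source
        }
        int base = sourceOffsets[start];
        for (int i = 0; i <= length; i++) {
            target[i] = sourceOffsets[start + i] - base;
        }
        return target;
    }

    /**
     * Splits every nesting level. levels[0] holds the outermost offsets,
     * levels[1] the offsets of its child list, and so on. A zero-length
     * range is propagated downward so each level still gets its {0} entry.
     */
    static int[][] splitNested(int[][] levels, int start, int length) {
        int[][] out = new int[levels.length][];
        int s = start;
        int len = length;
        for (int d = 0; d < levels.length; d++) {
            out[d] = splitOffsets(levels[d], s, len);
            if (len == 0) {
                s = 0; // child range stays zero-length
            } else {
                int childStart = levels[d][s];
                int childEnd = levels[d][s + len];
                s = childStart;
                len = childEnd - childStart;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] levels = {{0, 2, 3}, {0, 1, 3, 6}};
        System.out.println(Arrays.deepToString(splitNested(levels, 1, 1)));
        System.out.println(Arrays.deepToString(splitNested(levels, 0, 0)));
    }
}
```

In this model the zero-length case never reads the source offsets, which mirrors the case of a source vector whose offset buffer was never allocated.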
Why
splitAndTransfer() should always return a valid allocated vector. A vector with no entries still needs, by spec, a value of 0 in its offsetBuffer; a list vector with a zero-capacity offsetBuffer is not valid.
PR 30 is still useful as serialization/export hardening, but this moves the repair closer to where the invalid empty-vector state can be introduced.
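As a rough illustration of the invariant stated above, the following check treats an offset buffer as a plain int array. It is a sketch under assumptions: OffsetInvariantSketch and isValid are hypothetical names, and the rule that the first offset equals 0 is taken from this description for an unsliced vector rather than from Arrow's validation code.

```java
/** Toy validity check for list offsets. Hypothetical names; not part of Arrow. */
final class OffsetInvariantSketch {

    /**
     * A list vector with valueCount entries needs at least valueCount + 1
     * offsets, starting at 0 and never decreasing. An empty vector
     * therefore still needs the single offset 0.
     */
    static boolean isValid(int[] offsets, int valueCount) {
        if (valueCount < 0 || offsets.length < valueCount + 1) {
            return false; // covers the zero-capacity offset buffer case
        }
        if (offsets[0] != 0) {
            return false;
        }
        for (int i = 1; i <= valueCount; i++) {
            if (offsets[i] < offsets[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(new int[0], 0));       // zero-capacity buffer
        System.out.println(isValid(new int[] {0}, 0));    // empty but valid
    }
}
```

The first call models the invalid state this PR targets; the second models the state a zero-length split is expected to produce.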
One grey area remains: getFieldBuffers() triggering allocation for an otherwise unallocated vector is useful as a last-line guard, but it is not the cleanest owner of the allocation invariant.
Testing
mvn -pl vector -Dmaven.gitcommitid.skip=true -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestListVector,TestLargeListVector test
mvn -pl vector spotless:apply
Targeted tests passed under both Netty and Unsafe allocator executions.