feat(java): add allowExternalBlobOutsideBases to WriteParams #6330

Open
beinan wants to merge 1 commit into lance-format:main from beinan:feat/java-allow-external-blob

Contributor

@beinan beinan commented Mar 30, 2026

Summary

Add support for writing blob v2 columns with external URI references that are outside registered base paths. This enables use cases like INSERT INTO SELECT across Lance tables where the target table stores external blob references pointing to the source table's blob files instead of copying the actual blob bytes.

Changes

  • WriteParams.java: Add an Optional<Boolean> allowExternalBlobOutsideBases field, with getter and builder method
  • Fragment.java: Pass the new field through createWithFfiArray and createWithFfiStream native methods
  • fragment.rs (JNI): Thread the new Optional<Boolean> parameter through all fragment creation functions to extract_write_params
  • utils.rs (JNI): Parse the new parameter and set allow_external_blob_outside_bases on Rust WriteParams
  • blocking_dataset.rs (JNI): Pass JObject::null() for the new param in Dataset.write() path (not needed there)
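
For readers who have not opened the diff, the Java-side shape of the change can be sketched roughly as below. This is a minimal stand-in, not the SDK's actual class: `WriteParamsSketch` is a hypothetical name, only the new field is modeled, and the builder method name `withAllowExternalBlobOutsideBases` is an assumption. The point it illustrates is that the field is optional, so an unset value can be passed through JNI as "absent" and the Rust side keeps its own default.

```java
import java.util.Optional;

// Hypothetical stand-in for the SDK's WriteParams; only the new field is modeled.
final class WriteParamsSketch {
    // Empty means "not set", so the native side can keep its own default.
    private final Optional<Boolean> allowExternalBlobOutsideBases;

    private WriteParamsSketch(Optional<Boolean> allowExternalBlobOutsideBases) {
        this.allowExternalBlobOutsideBases = allowExternalBlobOutsideBases;
    }

    Optional<Boolean> getAllowExternalBlobOutsideBases() {
        return allowExternalBlobOutsideBases;
    }

    static final class Builder {
        private Optional<Boolean> allowExternalBlobOutsideBases = Optional.empty();

        // The method name is an assumption made for this sketch.
        Builder withAllowExternalBlobOutsideBases(boolean allow) {
            this.allowExternalBlobOutsideBases = Optional.of(allow);
            return this;
        }

        WriteParamsSketch build() {
            return new WriteParamsSketch(allowExternalBlobOutsideBases);
        }
    }

    public static void main(String[] args) {
        WriteParamsSketch defaults = new Builder().build();
        WriteParamsSketch optedIn =
            new Builder().withAllowExternalBlobOutsideBases(true).build();
        System.out.println(defaults.getAllowExternalBlobOutsideBases()); // Optional.empty
        System.out.println(optedIn.getAllowExternalBlobOutsideBases());  // Optional[true]
    }
}
```

Callers that never touch the new builder method get an empty Optional, which is how the Dataset.write() path can pass a null object for this parameter without changing behavior.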

Context

This is a prerequisite for lance-spark blob JOIN support (lance-format/lance-spark#355). When blob data flows through Spark's shuffle during JOIN + INSERT INTO, the target table needs to write external blob references pointing to the source table's physical blob files. The Rust BlobPreprocessor already supports this via allow_external_blob_outside_bases, but the Java SDK had no way to set it.
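
To make the role of the flag concrete, here is a rough sketch of the kind of check it gates. It is not the Rust BlobPreprocessor's actual logic: the class name, the method names, the example URIs, and the use of simple prefix matching against base paths are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; the real check lives in the Rust BlobPreprocessor.
final class ExternalBlobCheckSketch {
    // True when the URI sits under one of the registered base paths.
    // Plain prefix matching is an assumption of this sketch.
    static boolean isUnderAnyBase(String blobUri, List<String> basePaths) {
        for (String base : basePaths) {
            String prefix = base.endsWith("/") ? base : base + "/";
            if (blobUri.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // An external reference is accepted if it is under a registered base,
    // or if the caller opted in to references outside all bases.
    static boolean acceptExternalBlob(
            String blobUri, List<String> basePaths, boolean allowOutsideBases) {
        return allowOutsideBases || isUnderAnyBase(blobUri, basePaths);
    }

    public static void main(String[] args) {
        List<String> targetBases = Arrays.asList("s3://bucket/target_table");
        String sourceBlob = "s3://bucket/source_table/data/blob-0001.bin";

        // A source-table blob is outside the target's bases, so it needs the flag.
        System.out.println(acceptExternalBlob(sourceBlob, targetBases, false)); // false
        System.out.println(acceptExternalBlob(sourceBlob, targetBases, true));  // true
    }
}
```

In the JOIN + INSERT INTO case described above, the source table's blob files would typically fall outside the target table's bases, which is why the Java SDK needs a way to opt in.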

Ref: #6321, #6322

Test plan

  • Rust JNI code compiles cleanly (no errors in changed files)
  • Java unit tests (CI)

🤖 Generated with Claude Code

Add support for writing blob v2 columns with external URI references
that are outside registered base paths. This enables use cases like
INSERT INTO SELECT across Lance tables where the target table stores
external blob references pointing to the source table's blob files.

Changes:
- WriteParams: add allowExternalBlobOutsideBases field + builder method
- Fragment JNI: thread the new parameter through createWithFfiArray,
  createWithFfiStream, and create_fragment to Rust WriteParams
- extract_write_params: parse the new Optional<Boolean> and set
  allow_external_blob_outside_bases on Rust WriteParams

Ref: lance-format#6321
Ref: lance-format#6322

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@jackye1995 jackye1995 left a comment

looks good, pending CI fixes

Labels

enhancement, java
