feat!: migrate bitmap to index segment based #6869

Open: Xuanwo wants to merge 5 commits into main from xuanwo/bitmap-segment-build-path

Xuanwo (Collaborator) commented May 20, 2026

Adds Bitmap support to the existing segment-based distributed index workflow.

Callers can now build staged Bitmap roots with create_index_uncommitted(..., index_type="BITMAP", fragment_ids=...), finalize them through create_index_segment_builder().with_index_type("BITMAP").with_segments(...).build_all(), and publish them with commit_existing_index_segments(...).

For Bitmap, execute_uncommitted now writes canonical bitmap_page_lookup.lance segment roots directly. The old public Python Bitmap shard workflow through create_scalar_index(..., fragment_ids=...) and merge_index_metadata(..., "BITMAP") is no longer exposed; callers should use the segment workflow instead.
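
The three stages above can be sketched as follows. This is only an illustration of the call order named in the description: `StubDataset`, `StubSegmentBuilder`, the `column` argument, and `build_bitmap_index` are hypothetical stand-ins, and the real Lance signatures (elided with `...` in the description) may differ.

```python
# Stand-in objects only: they mimic the call order from the PR description,
# not the real Lance signatures, which the description elides with "...".


class StubSegmentBuilder:
    """Hypothetical stand-in for what create_index_segment_builder() returns."""

    def __init__(self):
        self.index_type = None
        self.segments = []

    def with_index_type(self, index_type):
        self.index_type = index_type
        return self

    def with_segments(self, segments):
        self.segments = list(segments)
        return self

    def build_all(self):
        # The real builder finalizes staged roots; here we only tag them.
        return [("finalized", self.index_type, seg) for seg in self.segments]


class StubDataset:
    """Hypothetical stand-in for a Lance dataset handle."""

    def __init__(self):
        self.committed = []

    def create_index_uncommitted(self, column, index_type, fragment_ids):
        # Stage 1: build a staged segment covering only these fragments.
        return (column, tuple(sorted(fragment_ids)))

    def create_index_segment_builder(self):
        return StubSegmentBuilder()

    def commit_existing_index_segments(self, segments):
        # Stage 3: publish the finalized segments.
        self.committed.extend(segments)


def build_bitmap_index(ds, column, fragment_groups):
    # Stage 1: one staged segment per group of fragments (e.g. per worker).
    staged = [
        ds.create_index_uncommitted(column, index_type="BITMAP", fragment_ids=group)
        for group in fragment_groups
    ]
    # Stage 2: finalize all staged segments through the builder.
    finalized = (
        ds.create_index_segment_builder()
        .with_index_type("BITMAP")
        .with_segments(staged)
        .build_all()
    )
    # Stage 3: commit the finalized segments in one step.
    ds.commit_existing_index_segments(finalized)
    return finalized


ds = StubDataset()
result = build_bitmap_index(ds, "category", [[0, 1], [2, 3]])
```

Nothing is visible to readers until the final commit call, which is what lets the first stage run independently per fragment group.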

Relates to OSS-971 and OSS-972.

github-actions (Contributor) commented

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

Xuanwo changed the title from "Support canonical Bitmap index segments" to "feat: support canonical Bitmap index segments" May 20, 2026
github-actions bot added the "enhancement" (New feature or request) label May 20, 2026
Xuanwo marked this pull request as ready for review May 20, 2026 11:16
claude bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

codecov bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 81.76471% with 31 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance/src/index/scalar/bitmap.rs | 62.31% | 23 Missing and 3 partials ⚠️ |
| rust/lance/src/index/create.rs | 77.77% | 4 Missing ⚠️ |
| rust/lance/src/index/scalar_logical.rs | 98.66% | 1 Missing ⚠️ |

westonpace (Member) left a comment

I have some questions; I probably just need to get up to speed on these concepts.

Comment thread python/python/lance/dataset.py Outdated
Comment on lines +3265 to +3267
    return self._ds.create_index(
        [column], index_type, name, replace, train, None, kwargs
    )

Should you document what is being returned?

Comment thread python/python/lance/dataset.py Outdated
Comment on lines +3966 to +3967
    IndexConfig(index_type="bitmap", parameters={})
    if isinstance(index_type, str) and index_type.upper() == "BITMAP"

I'm a little confused. If index_type == "BITMAP", you switch to an IndexConfig here (with no parameters) and then call create_scalar_index. But create_scalar_index also checks for index_type == "BITMAP"; how would that ever be true?

Comment thread rust/lance-index/src/scalar/bitmap.rs Outdated
Comment on lines +450 to +456
    let mut frag_ids = RoaringBitmap::new();
    for key in self.index_map.keys() {
        let bitmap = self.load_bitmap(key, None).await?;
        frag_ids.extend(bitmap.iter().map(|(fragment_id, _)| *fragment_id));
    }
    frag_ids.extend(self.null_map.iter().map(|(fragment_id, _)| *fragment_id));
    Ok(frag_ids)

Why implement this method? I thought it was only a utility method for an old migration before we had fragment bitmaps?
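
For context, here is a minimal Python sketch of what the Rust snippet under review appears to compute: the set of fragment IDs touched by any value's bitmap or by the null bitmap. It assumes the Lance row-address convention of a 64-bit address with the fragment ID in the upper 32 bits. The dictionary layout and the helper names (`addr`, `fragment_id`, `covered_fragments`) are illustrative, not the Rust implementation.

```python
def addr(frag, offset):
    # Pack a fragment ID and a row offset into one 64-bit row address.
    return (frag << 32) | offset


def fragment_id(row_addr):
    # The fragment ID lives in the upper 32 bits.
    return row_addr >> 32


def covered_fragments(index_map, null_rows):
    """Union of fragment IDs across every value bitmap plus the null bitmap."""
    frags = set()
    for rows in index_map.values():
        frags.update(fragment_id(r) for r in rows)
    frags.update(fragment_id(r) for r in null_rows)
    return frags


# Toy index: value -> row addresses, plus rows where the column is null.
index_map = {"a": [addr(0, 5), addr(2, 1)], "b": [addr(2, 7)]}
null_rows = [addr(3, 0)]
covered = covered_fragments(index_map, null_rows)
```

Note that this walks every bitmap in the index, which is why storing the fragment bitmap alongside the index is cheaper than recomputing it.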

Comment thread python/python/lance/dataset.py Outdated
        raise NotImplementedError(
            "Scalar indices currently only support a single column"
        )
    return self.create_scalar_index(

The method is called create_index_uncommitted, but if you call create_scalar_index, isn't it going to commit the index?

Xuanwo force-pushed the xuanwo/bitmap-segment-build-path branch from 256c793 to eb9868d May 21, 2026 07:03
Xuanwo changed the title from "feat: support canonical Bitmap index segments" to "feat: support canonical bitmap segment builds" May 21, 2026

Xuanwo (Collaborator, Author) commented May 21, 2026

Hi @westonpace, thanks for the review! I think the main source of confusion is that the current status of distributed bitmap building doesn't align with our index segments design.

For now, the bitmap has special logic around distributed index building. It includes a concept called shard_id and has its own pipeline. My plan is to migrate the entire pipeline to be index segment based instead.

I have rewritten the PR to migrate bitmap to the index segment API. This also makes the PR a breaking change.

Xuanwo changed the title from "feat: support canonical bitmap segment builds" to "feat: support bitmap index segment builds" May 21, 2026
Xuanwo force-pushed the xuanwo/bitmap-segment-build-path branch from 5525253 to 63b4b0e May 21, 2026 07:40
Xuanwo changed the title from "feat: support bitmap index segment builds" to "feat: support bitmap index segment builder" May 21, 2026
Xuanwo force-pushed the xuanwo/bitmap-segment-build-path branch from 63b4b0e to 04de91a May 21, 2026 08:12
Xuanwo changed the title from "feat: support bitmap index segment builder" to "feat: support bitmap index segments" May 21, 2026
Xuanwo changed the title from "feat: support bitmap index segments" to "feat!: migrate bitmap to index segment based" May 21, 2026