feat!: migrate bitmap to index segment based#6869
Conversation
|
ACTION NEEDED The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
westonpace
left a comment
There was a problem hiding this comment.
I have some questions, I probably just need to get up to speed on these concepts.
| return self._ds.create_index( | ||
| [column], index_type, name, replace, train, None, kwargs | ||
| ) |
There was a problem hiding this comment.
Should you document what is being returned?
| IndexConfig(index_type="bitmap", parameters={}) | ||
| if isinstance(index_type, str) and index_type.upper() == "BITMAP" |
There was a problem hiding this comment.
I'm a little confused. If index_type == "BITMAP" you are switching to IndexConfig here (with no parameters) and then calling create_scalar_index but in create_scalar_index you also look for index_type == "BITMAP" but how would that ever be true?
| let mut frag_ids = RoaringBitmap::new(); | ||
| for key in self.index_map.keys() { | ||
| let bitmap = self.load_bitmap(key, None).await?; | ||
| frag_ids.extend(bitmap.iter().map(|(fragment_id, _)| *fragment_id)); | ||
| } | ||
| frag_ids.extend(self.null_map.iter().map(|(fragment_id, _)| *fragment_id)); | ||
| Ok(frag_ids) |
There was a problem hiding this comment.
Why implement this method? I thought it was only a utility method for an old migration before we had fragment bitmaps?
| raise NotImplementedError( | ||
| "Scalar indices currently only support a single column" | ||
| ) | ||
| return self.create_scalar_index( |
There was a problem hiding this comment.
The method is called create_index_uncommitted but if you call create_scalar_index isn't it going to commit the index?
256c793 to
eb9868d
Compare
|
Hi @westonpace, thanks for the review! I think the main source of confusion is that the current status of distributed bitmap building doesn't align with our index segments design. For now, the bitmap has special logic around distributed index building. It includes a concept called I have rewritten the PR migrating the index segment API. This will also make this PR a breaking change one. |
5525253 to
63b4b0e
Compare
63b4b0e to
04de91a
Compare
Adds Bitmap support to the existing segment-based distributed index workflow.
Callers can now build staged Bitmap roots with
create_index_uncommitted(..., index_type="BITMAP", fragment_ids=...), finalize them throughcreate_index_segment_builder().with_index_type("BITMAP").with_segments(...).build_all(), and publish them withcommit_existing_index_segments(...).For Bitmap,
execute_uncommittednow writes canonicalbitmap_page_lookup.lancesegment roots directly. The old public Python Bitmap shard workflow throughcreate_scalar_index(..., fragment_ids=...)andmerge_index_metadata(..., "BITMAP")is no longer exposed; callers should use the segment workflow instead.Relates to OSS-971 and OSS-972.