feat: support rangebitmap read and write #185
const Literal& literal) {
    return std::make_shared<BitmapIndexResult>(
        [self = shared_from_this(), literal]() -> Result<RoaringBitmap32> {
            if (!self->range_bitmap_) {
I'm a bit curious — are there any cases where self->range_bitmap_ could be null?
From the current logic, it seems like it should always be initialized as long as no exception occurs during setup.
You're right. The constructor of paimon::RangeBitmapFileIndexReader::RangeBitmapFileIndexReader was public before, and I later made it private, so the defensive check is unnecessary.
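To illustrate why a private constructor makes the null check redundant, here is a minimal sketch of the pattern. `Reader`, `Create`, and `PayloadSize` are hypothetical stand-ins, not the actual paimon-cpp class or API; the assumption is that a static factory is the only construction path and that it rejects missing state up front.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the reader discussed above; not paimon-cpp's real class.
class Reader : public std::enable_shared_from_this<Reader> {
public:
    // The only way to obtain a Reader: validation happens before construction.
    static std::shared_ptr<Reader> Create(std::unique_ptr<std::string> payload) {
        if (!payload) {
            return nullptr;  // refuse to build a half-initialized object
        }
        // std::make_shared cannot reach a private constructor, so use new here.
        return std::shared_ptr<Reader>(new Reader(std::move(payload)));
    }

    // No null check needed: Create() guarantees payload_ is set.
    std::size_t PayloadSize() const { return payload_->size(); }

private:
    explicit Reader(std::unique_ptr<std::string> payload) : payload_(std::move(payload)) {}

    std::unique_ptr<std::string> payload_;
};
```

With the constructor private, every live `Reader` has passed through `Create`, so member functions can rely on `payload_` being non-null.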
PAIMON_ASSIGN_OR_RAISE(int32_t min_compare, key.CompareTo(min_));
PAIMON_ASSIGN_OR_RAISE(int32_t max_compare, key.CompareTo(max_));
PAIMON_ASSIGN_OR_RAISE(BitSliceIndexBitmap * bit_slice_ptr, this->GetBitSliceIndex());
if (min_compare == 0 && max_compare == 0) {
Maybe BitSliceIndexBitmap* bit_slice_ptr?
    return RoaringBitmap32();
}
PAIMON_ASSIGN_OR_RAISE(Dictionary * dictionary, this->GetDictionary());
PAIMON_ASSIGN_OR_RAISE(int32_t code, dictionary->Find(key));
Minor style note: for consistency, could we place the pointer * next to the type instead of the variable?
There seems to be a small problem with pre-commit. I wrote Dictionary* dictionary and ran pre-commit run -a, and pre-commit changed it to Dictionary * dictionary. I'll modify it and see whether it passes CI.
The clang-format check failed. I guess it is related to the PAIMON_ASSIGN_OR_RAISE macro: clang-format cannot tell whether Dictionary* dictionary is a variable declaration or an expression.
I checked other places in paimon-cpp, such as src/paimon/core/io/row_to_arrow_array_converter.h:136, and auto seems to be the solution:
PAIMON_ASSIGN_OR_RAISE(auto* string_builder,
CastToTypedBuilder<arrow::StringBuilder>(array_builder));
The * might be interpreted as a mathematical multiplication, so clang-format puts spaces on both sides of it?
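The auto* workaround from this thread can be sketched with a self-contained example. `DEMO_ASSIGN_OR_RAISE`, `Result`, `Error`, `GetDictionary`, and `Lookup` below are simplified, hypothetical stand-ins written for illustration; they are not the real PAIMON_ASSIGN_OR_RAISE macro or paimon-cpp types, and the formatter behaviour described in the comment is the guess made above, not something confirmed here.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified error and result types (not paimon-cpp's).
struct Error {
    std::string message;
};

template <typename T>
class Result {
public:
    Result(T value) : value_(std::move(value)), ok_(true) {}
    Result(Error error) : error_(std::move(error)), ok_(false) {}
    bool ok() const { return ok_; }
    T& value() { return value_; }
    const Error& error() const { return error_; }

private:
    T value_{};
    Error error_;
    bool ok_;
};

#define DEMO_CONCAT_INNER(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT_INNER(a, b)
// Evaluate expr once; on failure return its error, otherwise bind lhs.
#define DEMO_ASSIGN_OR_RAISE(lhs, expr)                      \
    auto DEMO_CONCAT(demo_result_, __LINE__) = (expr);       \
    if (!DEMO_CONCAT(demo_result_, __LINE__).ok()) {         \
        return DEMO_CONCAT(demo_result_, __LINE__).error();  \
    }                                                        \
    lhs = std::move(DEMO_CONCAT(demo_result_, __LINE__).value());

struct Dictionary {
    int Find(int key) const { return key * 2; }
};

Result<Dictionary*> GetDictionary(Dictionary* dict) {
    if (dict == nullptr) {
        return Error{"dictionary not loaded"};
    }
    return dict;
}

Result<int> Lookup(Dictionary* dict, int key) {
    // `auto*` can only start a declaration, so a formatter has less reason to
    // read `*` as a binary operator, as it may with `Dictionary * dictionary`.
    DEMO_ASSIGN_OR_RAISE(auto* dictionary, GetDictionary(dict));
    return dictionary->Find(key);
}
```

Because the macro argument is pasted in front of `=`, both `Dictionary* dictionary` and `auto* dictionary` compile to the same declaration; the choice only affects how the formatter reads the argument.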
Purpose
Linked issue: close #146
Tests
UT in rangebtimap_file_index_test.cpp
IT in paimon::test::ReadInteWithIndexTest::CheckResultForRangeBitmap
Data is generated using paimon-java v1.3.1.
The same data and the same queries, with both single-chunk and multi-chunk, should produce the same results.
Tests were mainly written by AI and reviewed by a human.
Test coverage:
Coverage of range_bitmap_file_index.cpp is a little low (82.7%) because there is no write integration test covering the CreateWriter method.
API and Format
Documentation
Generative AI tooling
Generated-by: Kimi K2.5