[rust] typed column-vector dispatch #578
leekeiabstraction
left a comment
Thanks for the PR. I haven't managed to read it all yet, but I left a question. I'll pick it up again tomorrow.
    self.row_id
}

/// Returns the source `RecordBatch`. Panics on a nested cursor (returned
Curious why this panics rather than returning an error. Both `ColumnarRow` and this function are `pub`. IMO we should reduce footguns where possible.
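To make the question concrete, here is a minimal sketch of the error-returning shape. Every name in it (`Batch`, `Source`, `Row`, `RowError`, `try_record_batch`) is hypothetical and stands in for the real types; it is not the code in this PR.

```rust
/// Hypothetical stand-in for an Arrow `RecordBatch`.
#[derive(Debug, PartialEq)]
struct Batch {
    num_rows: usize,
}

/// Where a row cursor reads from. A nested cursor has no whole batch to hand out.
enum Source<'a> {
    TopLevel(&'a Batch),
    Nested,
}

#[derive(Debug, PartialEq)]
enum RowError {
    NestedCursor,
}

struct Row<'a> {
    source: Source<'a>,
    row_id: usize,
}

impl<'a> Row<'a> {
    pub fn row_id(&self) -> usize {
        self.row_id
    }

    /// Returns the source batch, or an error for a nested cursor,
    /// so a caller of this public API cannot trigger a panic.
    pub fn try_record_batch(&self) -> Result<&'a Batch, RowError> {
        match &self.source {
            Source::TopLevel(batch) => Ok(*batch),
            Source::Nested => Err(RowError::NestedCursor),
        }
    }
}
```

With this shape the caller decides how to handle the nested case, for example by propagating the error with `?`.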
charlesdong1991
left a comment
Thanks for the great PR. I'll need some more time to read it through and get more familiar with the current behaviour in Java. This was just a quick look; I'll take another look later 🙏
    None => arrow_row_column_indices(&record_batch),
};
let schema = fluss_row_type.as_deref().unwrap_or(&row_type);
let typed = TypedBatch::build(&record_batch, schema)
So `build` is now fallible, and a malformed batch panics the scan instead of surfacing an error?
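For illustration, a std-only sketch of how a fallible `build` could surface a malformed batch as an error. `Int32Col`, `Utf8Col`, `LogicalType`, `ReadError`, and `first_int` are made-up stand-ins, and this `TypedBatch`/`TypedColumn` pair is a simplification that borrows the PR's names, not its actual definitions. The downcast happens once per column in `build`; the getters then match on an already-typed enum and bounds-check the row.

```rust
use std::any::Any;

// Hypothetical stand-ins for Arrow arrays; these are not arrow-rs types.
struct Int32Col(Vec<i32>);
struct Utf8Col(Vec<String>);

#[derive(Debug, Clone, Copy, PartialEq)]
enum LogicalType {
    Int,
    Str,
}

#[derive(Debug, PartialEq)]
enum ReadError {
    ArityMismatch { expected: usize, actual: usize },
    TypeMismatch { column: usize },
    WrongGetter { column: usize },
    OutOfBounds { column: usize, row: usize },
}

/// A column whose concrete type was resolved once, at batch construction.
enum TypedColumn<'a> {
    Int(&'a Int32Col),
    Str(&'a Utf8Col),
}

struct TypedBatch<'a> {
    columns: Vec<TypedColumn<'a>>,
}

impl<'a> TypedBatch<'a> {
    /// Downcasts every column exactly once; a mismatch is an error, not a panic.
    fn build(raw: &'a [Box<dyn Any>], schema: &[LogicalType]) -> Result<Self, ReadError> {
        if raw.len() != schema.len() {
            return Err(ReadError::ArityMismatch { expected: schema.len(), actual: raw.len() });
        }
        let mut columns = Vec::with_capacity(schema.len());
        for (i, (col, ty)) in raw.iter().zip(schema).enumerate() {
            let typed = match ty {
                LogicalType::Int => col.downcast_ref::<Int32Col>().map(TypedColumn::Int),
                LogicalType::Str => col.downcast_ref::<Utf8Col>().map(TypedColumn::Str),
            };
            columns.push(typed.ok_or(ReadError::TypeMismatch { column: i })?);
        }
        Ok(Self { columns })
    }

    /// No downcast here: only an enum match and a bounds-checked read.
    fn get_int(&self, column: usize, row: usize) -> Result<i32, ReadError> {
        match self.columns.get(column) {
            Some(TypedColumn::Int(c)) => {
                c.0.get(row).copied().ok_or(ReadError::OutOfBounds { column, row })
            }
            Some(_) => Err(ReadError::WrongGetter { column }),
            None => Err(ReadError::OutOfBounds { column, row }),
        }
    }

    fn get_str(&self, column: usize, row: usize) -> Result<&str, ReadError> {
        match self.columns.get(column) {
            Some(TypedColumn::Str(c)) => c
                .0
                .get(row)
                .map(String::as_str)
                .ok_or(ReadError::OutOfBounds { column, row }),
            Some(_) => Err(ReadError::WrongGetter { column }),
            None => Err(ReadError::OutOfBounds { column, row }),
        }
    }
}

/// Hypothetical scan step: `?` forwards a build failure to the caller.
fn first_int(raw: &[Box<dyn Any>], schema: &[LogicalType]) -> Result<i32, ReadError> {
    let batch = TypedBatch::build(raw, schema)?;
    batch.get_int(0, 0)
}
```

In this sketch a schema that disagrees with the data makes `first_int` return `Err(ReadError::TypeMismatch { .. })`, and the scan keeps running.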
    self.outer_array().is_null(row_id)
}

pub(crate) fn get_boolean(&self, row_id: usize) -> Result<bool> {
Now that everything is centralized in `TypedColumn`, I wonder whether it would be cheap to bounds-check the top-level getters?
    self.element.get_bytes(self.absolute_index(pos)?)
}

fn get_array(&self, pos: usize) -> Result<ArrayView<'_>> {
This is public, right? Should we update the API reference, since the return type has changed?
    }
}

fn get_map(&self, pos: usize) -> Result<MapView<'_>> {
Summary
closes #543
ColumnarRow now holds typed Arrow column vectors built once per batch instead of downcasting on every read. Nested ARRAY/MAP/ROW return lazy slice views - no copy unless the caller asks for a binary form via try_into_binary().
Mirrors Java's VectorizedColumnBatch.
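As a rough illustration of the lazy-view idea, here is a std-only sketch. `ListColumn` and `ViewError` are hypothetical, the `ArrayView` below is a simplified stand-in for the one in this PR, and the byte encoding in `try_into_binary` is invented for the example (little-endian `i64`s); it is not Fluss's binary format.

```rust
/// Hypothetical stand-in for an Arrow list column:
/// flat child values plus per-row offsets into them.
struct ListColumn {
    values: Vec<i64>,
    offsets: Vec<usize>,
}

#[derive(Debug, PartialEq)]
enum ViewError {
    OutOfBounds { pos: usize, len: usize },
    MalformedOffsets { row: usize },
}

/// Borrowed window over the child values; building one copies nothing.
struct ArrayView<'a> {
    values: &'a [i64],
    start: usize,
    len: usize,
}

impl ListColumn {
    fn get_array(&self, row: usize) -> Result<ArrayView<'_>, ViewError> {
        let rows = self.offsets.len().saturating_sub(1);
        if row >= rows {
            return Err(ViewError::OutOfBounds { pos: row, len: rows });
        }
        let (start, end) = (self.offsets[row], self.offsets[row + 1]);
        if start > end || end > self.values.len() {
            return Err(ViewError::MalformedOffsets { row });
        }
        Ok(ArrayView { values: &self.values, start, len: end - start })
    }
}

impl<'a> ArrayView<'a> {
    fn len(&self) -> usize {
        self.len
    }

    /// Translates a position inside the view to an index into the child values.
    fn absolute_index(&self, pos: usize) -> Result<usize, ViewError> {
        if pos < self.len {
            Ok(self.start + pos)
        } else {
            Err(ViewError::OutOfBounds { pos, len: self.len })
        }
    }

    fn get_long(&self, pos: usize) -> Result<i64, ViewError> {
        Ok(self.values[self.absolute_index(pos)?])
    }

    /// The only place data is copied. The encoding is made up for this sketch.
    fn try_into_binary(self) -> Result<Vec<u8>, ViewError> {
        let window = &self.values[self.start..self.start + self.len];
        Ok(window.iter().flat_map(|v| v.to_le_bytes()).collect())
    }
}
```

Reading through the view touches only the borrowed slice; an allocation happens only when the caller explicitly asks for the binary form.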