-
Notifications
You must be signed in to change notification settings - Fork 23
[ISSUE-153] Add blocking poll into python bindings #154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a poll() method to the Python LogScanner bindings, enabling incremental, timeout-based record reading with a configurable timeout parameter.
Changes:
- Added
LogScanner.poll(timeout_ms)method with input validation and timeout handling - Implemented
create_empty_table()helper method to return properly-schemed empty tables on timeout - Added example usage demonstrating the new poll-based consumption pattern
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| bindings/python/src/table.rs | Implements the new poll() method with validation, error handling, and empty table creation helper |
| bindings/python/example/example.py | Adds example demonstrating usage of the new poll() method |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fresh-borzoni Thanks for the pr. Only one question. PTAL
|
|
||
| let timeout = Duration::from_millis(timeout_ms as u64); | ||
| let scan_records = py | ||
| .detach(|| TOKIO_RUNTIME.block_on(async { self.inner.poll(timeout).await })) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, I'm thinking can we use scanner#poll_batches directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, thank you!
|
i need #211 to be merged, so that I can use offsets in |
55f7acb to
4cb28b5
Compare
|
@luoyuxia thanks for the review. Addressed comment. I updated to use new poll_batches everywhere, since it's more efficient and we avoid conversions when we don't need them. Also added missing slicing to to_arrow PTAL 🙏 |
4cb28b5 to
0333dd5
Compare
Purpose
Linked issue: close #153
Add a
poll()method to Python LogScanner for incremental, timeout-based record reading.Brief change log
LogScanner.poll(timeout_ms)method to Python bindingsAPI and Format
LogScanner.poll(timeout_ms: int) -> pyarrow.Table