feat: Add bigframes.pandas.job_history() API to track BigQuery jobs #2435
base: main
@dataclasses.dataclass
class JobMetadata:
Can we add a static factory method to build this from an SDK query job object?
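A minimal sketch of what such a factory could look like, not the PR's actual implementation. The four fields `error_result`, `cached`, `job_url`, and `query` come from the diff under review; the `job_id` field, the method name `from_query_job`, and the `job_url` parameter are hypothetical. The sketch reads attributes by name with `getattr` and uses a `SimpleNamespace` stand-in, so it assumes only that the job object exposes `job_id`, `error_result`, `cache_hit`, and `query`.

```python
import dataclasses
from types import SimpleNamespace
from typing import Any, Mapping, Optional


@dataclasses.dataclass
class JobMetadata:
    # job_id is a hypothetical field added for illustration; the remaining
    # four fields are the ones visible in the diff under review.
    job_id: Optional[str] = None
    error_result: Optional[Mapping[str, Any]] = None
    cached: Optional[bool] = None
    job_url: Optional[str] = None
    query: Optional[str] = None

    @staticmethod
    def from_query_job(job: Any, job_url: Optional[str] = None) -> "JobMetadata":
        """Build metadata from a query-job-like object (hypothetical helper).

        Attributes are read defensively so that a missing attribute becomes
        None instead of raising AttributeError.
        """
        return JobMetadata(
            job_id=getattr(job, "job_id", None),
            error_result=getattr(job, "error_result", None),
            cached=getattr(job, "cache_hit", None),
            job_url=job_url,
            query=getattr(job, "query", None),
        )


# Stand-in for an SDK query job object; the attribute names are assumptions.
fake_job = SimpleNamespace(
    job_id="job-123",
    error_result=None,
    cache_hit=True,
    query="SELECT 1",
)
meta = JobMetadata.from_query_job(fake_job)
```

Keeping the attribute mapping in one factory means call sites do not each need to know how SDK attribute names map onto `JobMetadata` fields.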
    error_result: Optional[Mapping[str, Any]] = None
    cached: Optional[bool] = None
    job_url: Optional[str] = None
    query: Optional[str] = None
I do worry that, at a certain point, storing all of the query text generated by the session might clog up memory.
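One possible way to bound that memory, sketched under stated assumptions rather than taken from the PR: cap the number of retained entries with `collections.deque(maxlen=...)` and truncate each stored query string. The names `MAX_JOBS`, `MAX_QUERY_CHARS`, and `truncate_query` are hypothetical, and the limits are deliberately tiny so the behavior is easy to see.

```python
import collections
from typing import Deque, Optional

# Hypothetical limits, chosen small for illustration only.
MAX_JOBS = 3
MAX_QUERY_CHARS = 20


def truncate_query(query: Optional[str], limit: int = MAX_QUERY_CHARS) -> Optional[str]:
    """Keep at most `limit` characters of query text, marking any cut with '...'."""
    if query is None or len(query) <= limit:
        return query
    return query[:limit] + "..."


# A deque with maxlen silently drops the oldest entry once it is full,
# so the history never holds more than MAX_JOBS items.
history: Deque[Optional[str]] = collections.deque(maxlen=MAX_JOBS)

for i in range(5):
    history.append(truncate_query(f"SELECT {i} FROM a_rather_long_table_name"))
```

The trade-off is that older jobs and the tails of long queries are no longer visible in the history, so a real implementation might make both limits configurable.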
sycai left a comment
I have a concern about placing job_history under the bigframes.pandas package. We could instead consider the top-level bigframes package or session instances as its home, mainly because functionality under bigframes.ml and bigframes.bigquery can also trigger jobs but does not belong to bigframes.pandas.
This PR is not ready for review. I need it for colab notebook testing.
This PR introduces a new function bigframes.pandas.job_history() that allows users to retrieve a pandas DataFrame listing the BigQuery jobs initiated by BigFrames in the current Python session. This provides visibility into the underlying BigQuery execution, including query text, resource usage, and job duration, which is invaluable for monitoring and optimization.
Key Changes:
Usage Example:
Verified in a VS Code notebook: screen/4MopaYNmCAQ5jZh
Fixes #<481840739> 🦕