
feat: support schema view #211

Merged
tswast merged 19 commits into main from schema
Feb 10, 2026

@ericfe-google
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

We now include query results directly in the HTML only when they are smaller than 100 KB. For larger query results, we store only a reference to the destination table in the HTML and have the Python code re-read the results from that table during the callback.

Also added a hard limit of 5 MB on the query result size; beyond that, graph visualization is not supported at all.
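
A minimal sketch of the size-based branching described above, not the code from this PR. The function name build_graph_params, the dictionary keys, and the choice to measure size as UTF-8 bytes of the JSON-serialized rows (with 1 KB taken as 1024 bytes) are assumptions made for illustration; only the 100 KB and 5 MB thresholds come from the description.

```python
import json

# Thresholds taken from the PR description; the 1024-based units are an assumption.
INLINE_LIMIT_BYTES = 100 * 1024
HARD_LIMIT_BYTES = 5 * 1024 * 1024


def build_graph_params(rows, destination_table):
    """Decide what to embed in the HTML for a graph query result.

    `rows` is a JSON-serializable list of result rows and `destination_table`
    is the table ID the rows can be re-read from (both names are hypothetical).
    """
    size = len(json.dumps(rows).encode("utf-8"))
    if size > HARD_LIMIT_BYTES:
        # Beyond the hard limit, visualization is not attempted at all.
        raise ValueError(f"Query result is {size} bytes; graph visualization is not supported.")
    if size < INLINE_LIMIT_BYTES:
        # Small results travel inside the HTML itself.
        return {"query_result": rows}
    # Larger results are re-read from the destination table in the callback.
    return {"destination_table": destination_table}
```

With this shape, the callback only needs to check which key is present to know whether it has to go back to the destination table.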
…d Spanner. This avoids clashes if the same notebook uses visualizers for both BigQuery graphs and Spanner graphs, which can cause malfunctions in the visualizers.
@product-auto-label product-auto-label bot added the labels "size: l" (Pull request size is large.) and "api: bigquery" (Issues related to the googleapis/python-bigquery-magics API.) on Feb 6, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @ericfe-google, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the BigQuery Magics' graph visualization capabilities by integrating schema information directly into the graph rendering process. By programmatically fetching and converting graph metadata from BigQuery's INFORMATION_SCHEMA, the visualization widget can now present more structured and contextually rich graphs, improving the user's understanding of their data relationships. This change streamlines the workflow for users interacting with property graphs in BigQuery.

Highlights

  • Graph Schema Retrieval: Introduced functionality to automatically retrieve graph schema information from BigQuery's INFORMATION_SCHEMA for GRAPH queries. This allows the visualization widget to display richer, schema-aware graphs.
  • Schema Conversion Logic: Added a new utility function _convert_schema in graph_server.py to transform the BigQuery-specific graph metadata JSON into a standardized format expected by the graph visualization framework.
  • Integration with Graph Widget: Modified the _add_graph_widget function to accept the BigQuery client and query text, enabling it to fetch and pass the converted schema to the graph visualization HTML generation process.
  • Enhanced Test Coverage: Added new unit tests to cover various scenarios for graph schema handling, including successful retrieval, cases where no graph name is found, and when a schema is not present in the INFORMATION_SCHEMA.
Changelog
  • bigquery_magics/bigquery.py
    • Added _get_graph_name to extract graph identifiers from SQL queries.
    • Added _get_graph_schema to query BigQuery's INFORMATION_SCHEMA for graph metadata.
    • Modified _add_graph_widget to incorporate schema retrieval and pass it to the visualization component.
    • Updated _make_bq_query to pass necessary arguments to _add_graph_widget.
  • bigquery_magics/graph_server.py
    • Introduced _convert_schema function to transform BigQuery graph metadata JSON into a visualization-friendly format.
    • Updated _convert_graph_data to accept and utilize the converted schema.
    • Modified convert_graph_params to parse and pass schema information from incoming parameters.
  • tests/unit/bigquery/test_bigquery.py
    • Added test_add_graph_widget_with_schema to verify correct schema retrieval and usage.
    • Added test_add_graph_widget_no_graph_name to ensure schema retrieval is skipped for non-graph queries.
    • Added test_add_graph_widget_schema_not_found to handle cases where graph schema is unavailable.
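
To make the first changelog item concrete, here is a rough sketch of how a graph identifier might be pulled out of a query that starts with a GRAPH clause. It is not the _get_graph_name from this PR: the regular expression, the function name get_graph_name, and the returned (dataset, graph) tuple are assumptions, and the sketch ignores leading comments and other edge cases.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: GRAPH [project.]dataset.graph, optionally wrapped in backticks.
_GRAPH_RE = re.compile(
    r"^\s*GRAPH\s+`?(?:(?P<project>[\w-]+)\.)?(?P<dataset>\w+)\.(?P<graph>\w+)`?",
    re.IGNORECASE,
)


def get_graph_name(query_text: str) -> Optional[Tuple[str, str]]:
    """Return (dataset, graph) for a GRAPH query, or None for any other query."""
    match = _GRAPH_RE.match(query_text)
    if match is None:
        return None
    return match.group("dataset"), match.group("graph")
```

Returning None for non-graph queries matches the behavior the new tests describe, where schema retrieval is skipped when no graph name is found.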

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ericfe-google ericfe-google changed the title from "Schema" to "feat: support schema view" on Feb 6, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds functionality to retrieve and display graph schemas by parsing the graph name from a query, fetching the schema from INFORMATION_SCHEMA, and passing it to the visualization framework. The overall approach is sound, but I've identified a critical SQL injection vulnerability that must be addressed. Additionally, I've noted some fragile logic related to string escaping and testing that could be improved for better maintainability.
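
The review does not quote the vulnerable line, so the following is only a sketch of one common way to avoid SQL injection when building an INFORMATION_SCHEMA lookup: validate the part that must be interpolated (the dataset) and pass the rest as a named query parameter. The function name build_schema_query, the PROPERTY_GRAPHS view, and the column names are assumptions here, not details confirmed by this PR.

```python
import re

# Assumption: dataset IDs are limited to ASCII letters, digits, and underscores.
_DATASET_RE = re.compile(r"^[A-Za-z0-9_]+$")


def build_schema_query(dataset_id: str, graph_id: str):
    """Return (sql, params) for looking up a graph's metadata.

    The dataset is part of the table path and cannot be a query parameter,
    so it is validated before interpolation. The graph name is passed as the
    named parameter @graph_id and never concatenated into the SQL text.
    """
    if not _DATASET_RE.match(dataset_id):
        raise ValueError(f"Unsafe dataset identifier: {dataset_id!r}")
    sql = (
        "SELECT property_graph_metadata_json "
        f"FROM `{dataset_id}`.INFORMATION_SCHEMA.PROPERTY_GRAPHS "
        "WHERE property_graph_name = @graph_id"
    )
    return sql, {"graph_id": graph_id}
```

With the BigQuery client, the params dictionary would typically be turned into bigquery.ScalarQueryParameter entries on a QueryJobConfig before the query is run.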

# Verify generate_visualization_html was called with the converted schema
assert gen_html_mock.called
params_str = gen_html_mock.call_args[1]["params"]
params = json.loads(params_str.replace('\\"', '"').replace("\\\\", "\\"))
Contributor

medium

The logic to parse params_str by manually reversing escaping is brittle and is repeated in the new tests (lines 1163, 1207, 1252). This can easily break if the escaping logic in _add_graph_widget changes or for more complex inputs. Assuming the problematic escaping in _add_graph_widget is fixed as per my other comment, you can simplify this to just load the JSON string.

Suggested change
- params = json.loads(params_str.replace('\\"', '"').replace("\\\\", "\\"))
+ params = json.loads(params_str)
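
A small illustration of why the manual reversal couples the tests to the producer's escaping. The payload is hypothetical; manual_unescape reproduces the two replace() calls from the test. If _add_graph_widget ever emits plain JSON, the reversal removes backslashes that JSON itself needs and parsing fails, while json.loads on the unmodified string keeps working.

```python
import json


def manual_unescape(params_str: str) -> str:
    # The same two replace() calls the tests use to undo the widget's escaping.
    return params_str.replace('\\"', '"').replace("\\\\", "\\")


# Hypothetical payload whose value contains a double quote.
payload = {"query": 'RETURN "x"'}
plain_json = json.dumps(payload)

# Loading the JSON string directly always round-trips.
assert json.loads(plain_json) == payload

# Applied to plain JSON, the manual reversal strips the escapes that protect
# the embedded quotes, so the result is no longer valid JSON.
try:
    json.loads(manual_unescape(plain_json))
    manual_unescape_ok = True
except json.JSONDecodeError:
    manual_unescape_ok = False
```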

# Verify generate_visualization_html was called without a schema
assert gen_html_mock.called
params_str = gen_html_mock.call_args[1]["params"]
params = json.loads(params_str.replace('\\"', '"').replace("\\\\", "\\"))
Contributor

medium

This test logic is brittle because it tries to manually reverse the escaping done in _add_graph_widget. This can easily break if the escaping logic changes or for more complex inputs. A better approach would be to simplify the test by just loading the JSON string, assuming the problematic escaping in _add_graph_widget is fixed.

Suggested change
- params = json.loads(params_str.replace('\\"', '"').replace("\\\\", "\\"))
+ params = json.loads(params_str)

# Verify generate_visualization_html was called without a schema
assert gen_html_mock.called
params_str = gen_html_mock.call_args[1]["params"]
params = json.loads(params_str.replace('\\"', '"').replace("\\\\", "\\"))
Contributor

medium

This test logic is brittle because it tries to manually reverse the escaping done in _add_graph_widget. This can easily break if the escaping logic changes or for more complex inputs. A better approach would be to simplify the test by just loading the JSON string, assuming the problematic escaping in _add_graph_widget is fixed.

Suggested change
- params = json.loads(params_str.replace('\\"', '"').replace("\\\\", "\\"))
+ params = json.loads(params_str)

@ericfe-google ericfe-google marked this pull request as ready for review February 6, 2026 22:53
@ericfe-google ericfe-google requested review from a team as code owners February 6, 2026 22:53
@ericfe-google ericfe-google requested a review from jialuoo February 6, 2026 22:53
Comment on lines +671 to +673
info_schema_results = bq_client.query(
info_schema_query, job_config=job_config
).to_dataframe()
Collaborator

No change needed, but FYI you'll get the best performance with query_and_wait() since that allows for the fast (jobless) query path to kick in.
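
For reference, a sketch of what the suggested alternative could look like. The wrapper name fetch_schema_dataframe is hypothetical; query_and_wait() is the google-cloud-bigquery client method mentioned above, and it returns the result rows directly, so to_dataframe() is called on those rows instead of on a query job. It assumes a client library version recent enough to provide query_and_wait().

```python
def fetch_schema_dataframe(bq_client, info_schema_query, job_config=None):
    """Run the INFORMATION_SCHEMA query and return the result as a DataFrame.

    `bq_client` is expected to be a google.cloud.bigquery.Client. Unlike
    query(), query_and_wait() waits for completion and returns the rows,
    which lets the client skip job creation when the backend allows it.
    """
    rows = bq_client.query_and_wait(info_schema_query, job_config=job_config)
    return rows.to_dataframe()
```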

@tswast tswast merged commit 8e1883e into main Feb 10, 2026
24 of 25 checks passed
@tswast tswast deleted the schema branch February 10, 2026 15:52
tswast added a commit that referenced this pull request Feb 10, 2026
PR created by the Librarian CLI to initialize a release. Merging this PR
will auto trigger a release.

Librarian Version: v0.8.0
Language Image:
us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:1a2a85ab507aea26d787c06cc7979decb117164c81dd78a745982dfda80d4f68
<details><summary>bigquery-magics: 0.12.0</summary>

## [0.12.0](v0.11.0...v0.12.0) (2026-02-10)

### Features

* support schema view (#211) ([8e1883e](8e1883ee))

* remove bqsql magic to make that name available for bigframes (#210) ([c46c94a](c46c94af))

### Bug Fixes

* reduce conflicts between Spanner and BigQuery graph visualization on Colab (#209) ([7dca7b1](7dca7b13))

</details>