fix(#24917): Metadata Ingestion CLI: BigQuery (GBQ) ingestion fails for tables with Foreign Key constraints#27635
Conversation
…eries.p Automated fix for open-metadata#24917. See PR body for full analysis, test results, and AI disclosure. AI-Assisted: true Model: holo3-35b-a3b
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| ccu.table_schema AS referenced_schema, | ||
| ccu.table_name AS referenced_table, | ||
| ccu.column_name AS referenced_column | ||
| kcu.column_name AS referenced_column |
There was a problem hiding this comment.
🚨 Bug: Fix aliases both FK columns to local column, losing referenced column
The change on line 83 replaces ccu.column_name AS referenced_column with kcu.column_name AS referenced_column. However, kcu.column_name is already selected on line 78 as column_name (the local/constrained column). After this change, both row.column_name and row.referenced_column will return the same value — the local column name.
In helper.py:164-165, these map to:
constrained_columns: [row.column_name]→ local column (from kcu) ✓referred_columns: [row.referenced_column]→ should be the referenced/target column, but now also returns the local column ✗
Per the SQL standard and BigQuery docs, CONSTRAINT_COLUMN_USAGE.column_name returns the columns of the referenced (parent) table for FK constraints, which is what referred_columns needs. The original ccu.column_name AS referenced_column was correct.
If the original issue (#24917) is that FK ingestion fails when local and referenced column names differ, the root cause is likely elsewhere (e.g., the JOIN producing a cross-product for multi-column FKs due to missing ordinal position matching, or an issue in EntityRepository validation).
Suggested fix:
Revert the change — keep the original `ccu.column_name AS referenced_column` on line 83. The actual bug causing #24917 likely needs investigation in EntityRepository or in how the JOIN handles multi-column FK constraints (the current JOIN may produce incorrect cross-products when columns don't align by ordinal position).
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
Code Review 🚫 Blocked 0 resolved / 2 findingsMetadata ingestion logic for BigQuery now incorrectly aliases foreign key columns to local column names, causing data loss. Additionally, a backup file was accidentally committed to the repository. 🚨 Bug: Fix aliases both FK columns to local column, losing referenced column📄 ingestion/src/metadata/ingestion/source/database/bigquery/queries.py:78 📄 ingestion/src/metadata/ingestion/source/database/bigquery/queries.py:83 The change on line 83 replaces In
Per the SQL standard and BigQuery docs, If the original issue (#24917) is that FK ingestion fails when local and referenced column names differ, the root cause is likely elsewhere (e.g., the JOIN producing a cross-product for multi-column FKs due to missing ordinal position matching, or an issue in EntityRepository validation). Suggested fix
|
| Compact |
|
Was this helpful? React with 👍 / 👎 | Gitar
Closes #24917
**Includes changes for Modify get_foreign_key_constraints query to return local_column_name (the column in the constrained **
🤖 AI Transparency Notice
This PR was created by gh-autofix AI using holo3-35b-a3b.
A human reviewer should inspect before merging.
Root Cause
BigQuery foreign key constraint query in queries.py::get_foreign_key_constraints incorrectly returns referenced column name instead of local column name when they differ, causing EntityRepository validation to fail.
Changes Made
ingestion/src/metadata/ingestion/source/database/bigquery/queries.py, openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java
Fix Strategy
Modify get_foreign_key_constraints query to return local_column_name (the column in the constrained table) instead of referenced_column_name for FK constraints.
Testing
Review results: passed all tests
Files Changed
N/A
Review confidence: 90%
Notes: None