Pull request overview
This PR strengthens Snowflake ingestion stability by making cluster-key expression parsing robust to deeply nested CLUSTERING_KEY expressions that previously could trigger RecursionError and crash the ingestion process.
Changes:
- Replace the recursive cluster-key token walker with an iterative DFS to avoid Python recursion limits.
- Add explicit `RecursionError` handling in `parse_column_name_from_expr` to log a warning and return an empty list.
- Add unit tests that simulate very deep cluster-key token trees and validate ordering + logging behavior.
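For readers unfamiliar with the technique in the first bullet, here is a minimal sketch of an explicit-stack DFS that collects identifier names in left-to-right order. It is not the code from this PR: the `Token` class, `collect_identifiers`, and `build_deep_tree` are hypothetical stand-ins that do not use the sqlparse token classes.

```python
from typing import List, Optional


class Token:
    """Hypothetical stand-in for a parsed token; not the sqlparse class."""

    def __init__(self, name: Optional[str] = None,
                 children: Optional[List["Token"]] = None) -> None:
        self.name = name                  # set only on identifier leaves
        self.children = children or []    # nested tokens, left to right


def collect_identifiers(root: Token) -> List[str]:
    """Walk the tree depth-first with an explicit stack instead of recursion."""
    names: List[str] = []
    stack = [root]
    while stack:
        token = stack.pop()
        if token.name is not None:
            names.append(token.name)
        # Push children reversed so the leftmost child is popped first,
        # which preserves the order a recursive walker would produce.
        stack.extend(reversed(token.children))
    return names


def build_deep_tree(depth: int) -> Token:
    """Build a chain of `depth` nested tokens with one identifier at each end."""
    root = Token(children=[Token(name="first")])
    current = root
    for _ in range(depth):
        nested = Token()
        current.children.append(nested)
        current = nested
    current.children.append(Token(name="last"))
    return root


if __name__ == "__main__":
    # 5000 levels exceeds CPython's default recursion limit of 1000.
    print(collect_identifiers(build_deep_tree(5000)))  # ['first', 'last']
```

Because the pending work lives in a heap-allocated list rather than on the call stack, nesting depth is bounded by memory and not by the interpreter's recursion limit.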
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| ingestion/src/metadata/ingestion/source/database/snowflake/metadata.py | Switches cluster-key identifier extraction to an explicit-stack DFS and adds a RecursionError fallback path with warning + safe empty result. |
| ingestion/tests/unit/topology/database/test_snowflake.py | Adds regression tests covering deep nesting, identifier DFS order preservation, repeated parsing, and downstream RecursionError logging behavior. |
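The fallback path described for `metadata.py` can be illustrated with a rough sketch like the one below. It is an assumption about the general shape, not the PR's implementation: `safe_parse_columns` and `naive_parser` are hypothetical names, and the recursive parser exists only to provoke a `RecursionError`.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)


def safe_parse_columns(expr: str,
                       parser: Callable[[str], List[str]]) -> List[str]:
    """Run `parser` and fall back to an empty list if it recurses too deeply."""
    try:
        return parser(expr)
    except RecursionError:
        # Log a truncated expression so the warning stays readable.
        logger.warning(
            "Cluster-key expression too deeply nested to parse, skipping: %.80s",
            expr,
        )
        return []


def naive_parser(expr: str) -> List[str]:
    """Hypothetical recursive parser: strips one pair of parentheses per call."""
    if expr.startswith("(") and expr.endswith(")"):
        return naive_parser(expr[1:-1])
    return [expr]


if __name__ == "__main__":
    print(safe_parse_columns("((col))", naive_parser))  # ['col']
    deep = "(" * 5000 + "col" + ")" * 5000
    print(safe_parse_columns(deep, naive_parser))       # []
```

Returning an empty list trades completeness for availability: the table is ingested without cluster-key columns instead of the whole ingestion run crashing.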
`from sqlparse.sql import Identifier as _SqlparseIdentifier  # noqa: E402`
There’s a mid-file import (from sqlparse.sql import Identifier as _SqlparseIdentifier) guarded by # noqa: E402. This will fight the repo’s Python formatting tooling (e.g., isort) and makes imports harder to reason about. Please move this import up with the other module imports (near the existing from sqlparse.sql import Function) and drop the E402 suppression.
Code Review ✅ Approved
Adds comprehensive unit tests for Snowflake cluster keys to ensure proper schema validation. No issues found.
🔴 Playwright Results: 1 failure(s), 15 flaky
✅ 3695 passed · ❌ 1 failed · 🟡 15 flaky · ⏭️ 89 skipped
Genuine Failures (failed on all attempts) ❌
Describe your changes:
Fixes
I worked on ... because ...
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>
Summary by Gitar
- ... `RecursionError` and ingestion pod crashes.
- ... `RecursionError` handling in `parse_column_name_from_expr` as a defensive measure against deep expression trees.
- ... `test_snowflake.py` covering deep recursion scenarios, DFS order preservation, and error handling for pathologically complex cluster keys.

This will update automatically on new commits.