Fix 1.13 ingestion import regression in patch_mixin #29156
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
🟡 Playwright Results: all passed (9 flaky)
✅ 3900 passed · ❌ 0 failed · 🟡 9 flaky · ⏭️ 80 skipped
🟡 9 flaky test(s) (passed on retry)
How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace
Code Review ✅ Approved 3 resolved / 3 findings
Removes the missing
✅ 3 resolved
✅ Edge Case: ContainerAdapter.set_columns silently drops columns when dataModel is None
✅ Quality: patch_column_tags mutates the caller's input entity
✅ Quality: Duplicated Table/Container dispatch and warning logic



What

Fix the `1.13` Python ingestion CLI startup failure by removing the `patch_mixin.py` dependency on `metadata.sampler.entity_adapters`, which does not exist on the `1.13` branch.

Instead of backporting the `entity_adapters` cleanup from `main`, this keeps the fix local to `patch_mixin.py` and handles the only classifiable entities supported on `1.13`:
- `Table` via `.columns`
- `Container` via `.dataModel.columns`

It also preserves both existing SDK call shapes for `patch_column_tags`:
- `patch_column_tags(table=...)`
- `patch_column_tags(entity=...)`

Why
AUT sample data ingestion fails before the workflow YAML is read because importing `metadata.ingestion.ometa.mixins.patch_mixin` raises a `ModuleNotFoundError`.

Root cause: #28837 was backported to `1.13` with an import dependency on an adapter module that exists on `main` from #27716, but that module was never present on `1.13`.

Why this shape
The earlier adapter-module backport was more than this branch needs. On `1.13`, `ClassifiableEntityType` is only `Table | Container`, so direct type handling in `patch_mixin.py` is the smallest fix that removes the bad import while preserving column tag patching and the optimistic-lock retry behavior from #28837.

The Table/Container rules are centralized in a local helper instead of a new module.
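A minimal sketch of what such a local helper could look like, using the field lists shown in the review flowchart. The dataclasses are simplified stand-ins for the generated `Table` and `Container` models, and `column_tag_patch_info` is an illustrative name, not the actual helper in `patch_mixin.py`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Simplified stand-ins for the generated schema models (assumption).
@dataclass
class Column:
    name: str
    tags: List[str] = field(default_factory=list)


@dataclass
class DataModel:
    columns: List[Column] = field(default_factory=list)


@dataclass
class Table:
    columns: List[Column] = field(default_factory=list)


@dataclass
class Container:
    dataModel: Optional[DataModel] = None


def column_tag_patch_info(
    entity: object,
) -> Optional[Tuple[List[str], Optional[List[Column]]]]:
    """Return (fields to fetch, column list) for a supported entity, else None."""
    if isinstance(entity, Table):
        return ["tags", "columns"], entity.columns
    if isinstance(entity, Container):
        # A Container without a dataModel has no columns to patch.
        columns = entity.dataModel.columns if entity.dataModel else None
        return ["tags", "dataModel"], columns
    # Anything else is unsupported on this branch.
    return None
```

Keeping both type rules in one function means the caller handles a single `None` check for unsupported entities and a second one for a missing column list.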
`patch_column_tags` now uses the fetched server entity as the patch source and applies tag changes only to a deep-copied destination, so caller-owned entities are not mutated as a side effect.

Regression coverage checked
I verified the touched logic with a focused stdlib-only smoke harness that imports the actual `patch_mixin.py` file with stubbed external dependencies and exercises:
- `Table` column tag patching
- `Container.dataModel.columns` column tag patching
- `Container` without caller-side `dataModel`, using fetched server state
- `Container` without server-side `dataModel`
- `table=` and `entity=` keyword compatibility
- `Table` or `Container`

Validation
- `python3 -m py_compile ingestion/src/metadata/ingestion/ometa/mixins/patch_mixin.py`
- `patch_mixin.py` smoke harness: `patch_mixin smoke ok`
- `uvx ruff check ingestion/src/metadata/ingestion/ometa/mixins/patch_mixin.py`
- `git diff --check`

Full local package import was blocked because this checkout does not include generated metadata schema modules, and `uv run` cannot resolve the project extras as currently declared.

Greptile Summary
This PR fixes a `ModuleNotFoundError` crash on the `1.13` branch caused by a backport of #28837 that imported `metadata.sampler.entity_adapters`, a module that only exists on `main`. The fix removes that import and replaces it with direct `isinstance` checks on `Table` and `Container`, the only two `ClassifiableEntityType` members present on `1.13`.
- Removes `from metadata.sampler.entity_adapters import ...` and replaces the adapter pattern with a `_column_tag_patch_info` helper that returns the right field list and column reference for each supported type.
- Keeps `patch_column_tags` backward-compatible by accepting both `table=` (new preferred name) and `entity=` (old name), and changes `source` from the caller-supplied entity to the freshly-fetched server instance, avoiding stale-state diffs.
- Updates `test_ometa_glossary.py` to match the 1.13 `GlossaryTerm.relatedTerms` schema change from `EntityReferenceList` to `List[TermRelation]`.

Confidence Score: 5/5
Safe to merge — the change removes a broken import that caused a startup crash and replaces it with well-tested inline logic scoped to the two entity types present on 1.13.
The fix is narrowly scoped: one bad import removed, one small helper added, backward-compatible signature kept. The optimistic-lock retry logic from the original backport is preserved unchanged. New unit tests cover all primary paths, and the integration test update tracks a schema change already present on this branch. No correctness regressions were found.
No files require special attention.
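As a rough illustration of the backward-compatible signature and the non-mutating update described in this PR, the sketch below accepts either keyword and applies tags to a deep copy. It uses `copy.deepcopy` on plain dataclasses as a stand-in for `model_copy(deep=True)`. The names `resolve_entity` and `prepare_destination` are hypothetical, and the precedence of `table=` over `entity=` when both are passed is an assumption.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# Simplified stand-ins for the generated schema models (assumption).
@dataclass
class Column:
    name: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Table:
    columns: List[Column] = field(default_factory=list)


def resolve_entity(
    table: Optional[Table] = None, entity: Optional[Table] = None
) -> Optional[Table]:
    """Accept either keyword; `table=` wins if both are given (assumption)."""
    return table if table is not None else entity


def prepare_destination(server_entity: Table, column_tags: Dict[str, str]) -> Table:
    """Apply tags to a deep copy so the fetched source stays untouched."""
    destination = copy.deepcopy(server_entity)
    for column in destination.columns:
        tag = column_tags.get(column.name)
        if tag is not None and tag not in column.tags:
            column.tags.append(tag)
    return destination
```

Because the source passed to the diff is the untouched server copy, the generated patch only reflects the tag changes made on the destination.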
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["patch_column_tags(table= / entity=)"] --> B{table is None?}
    B -- yes --> Z1[return None + warn]
    B -- no --> C["_column_tag_patch_info(table)"]
    C --> D{isinstance Table?}
    D -- yes --> E["fields=['tags','columns'], columns=table.columns"]
    D -- no --> F{isinstance Container?}
    F -- yes --> G["fields=['tags','dataModel'], columns=table.dataModel.columns"]
    F -- no --> Z2[return None + warn unsupported]
    E & G --> H["_fetch_entity_if_exists(entity_type, table.id, fields)"]
    H --> I{instance found?}
    I -- no --> Z3[return None]
    I -- yes --> J["_prepare_destination_for_column_tags(instance, column_tags, operation)"]
    J --> K["_column_tag_patch_info(instance) → columns ref"]
    K --> L{columns is None?}
    L -- yes --> Z4[return None + warn]
    L -- no --> M["instance.model_copy(deep=True) → destination"]
    M --> N["_column_tag_patch_info(destination) → destination_columns ref"]
    N --> O["update_column_tags(destination_columns, ...) for each tag"]
    O --> P[return destination]
    P --> Q["patch(entity_type, source=instance, destination=destination, if_match=etag)"]
    Q --> R{412 Precondition Failed?}
    R -- yes, retries left --> H
    R -- yes, retries exhausted --> Z5[return None + warn]
    R -- no --> S[return patched_entity]

Reviews (7): Last reviewed commit: "test(ingestion): align glossary related ..."
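The fetch, patch, and retry branch of the flowchart above can be sketched as a small loop. This is an illustration under stated assumptions, not the code in `patch_mixin.py`: `PreconditionFailed` stands in for an HTTP 412 response, the three callables are hypothetical hooks for the fetch, destination-building, and patch steps, and the default of three retries is arbitrary.

```python
from typing import Callable, Optional, Tuple


class PreconditionFailed(Exception):
    """Stand-in for an HTTP 412 response from the server (assumption)."""


def patch_with_retry(
    fetch: Callable[[], Optional[Tuple[dict, str]]],
    build_destination: Callable[[dict], Optional[dict]],
    patch: Callable[[dict, dict, str], dict],
    max_retries: int = 3,
) -> Optional[dict]:
    """Re-fetch and re-apply the change whenever the ETag check fails."""
    for _ in range(max_retries + 1):
        fetched = fetch()
        if fetched is None:
            # Entity not found on the server: nothing to patch.
            return None
        instance, etag = fetched
        destination = build_destination(instance)
        if destination is None:
            # No columns to patch on the fetched instance.
            return None
        try:
            # The server rejects the patch if the entity changed since the fetch.
            return patch(instance, destination, etag)
        except PreconditionFailed:
            # Stale ETag: loop back, fetch fresh state, and try again.
            continue
    # Retries exhausted.
    return None
```

Rebuilding the destination from the freshly fetched instance on every attempt is what keeps the retried patch from overwriting a concurrent change.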