Skip to content

Fix ODPS import + data-product metadata bugs (#27600)#29155

Open
harshach wants to merge 529 commits into
mainfrom
odps-import-data-product-metadata-fixes
Open

Fix ODPS import + data-product metadata bugs (#27600)#29155
harshach wants to merge 529 commits into
mainfrom
odps-import-data-product-metadata-fixes

Conversation

@harshach

@harshach harshach commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

The ODPS / Data Products feature (#27600) shipped with several bugs in its import, metadata-edit, and merge paths. They were surfaced by a new Playwright suite for the ODPS flow (which had no E2E coverage). All fixes were verified end-to-end on a live stack (via the 1.13 backport, companion PR #29154); this lands the same fixes on main.

Bugs fixed

  1. DataProduct PATCH never persisted dataProductType/visibility/portfolioPriority. DataProductUpdater.entitySpecificUpdate() recorded changes only for name and domains, so change-consolidation reverted these ODPS-aligned scalar fields on store — the PATCH response showed the value but a subsequent GET returned null, and the metadata-edit modal silently no-opped. Added recordChange for the three fields.

  2. ODPS import always returned 400 "Unknown custom field odpsMetadata". ODPSConverter.fromODPS wrote the raw document into extension.odpsMetadata, which is not a registered custom property, so validateExtension rejected every import. toODPS never reads it back, so the write was dead weight — removed it; merge/replace now preserve the existing product's real custom properties.

  3. ODPS import returned 409 because the domain reference built from ?domain= had no id. Resolved via Entity.getEntityReferenceByName.

  4. ODPS create-from-import returned 409 (null primary key) because the converted entity had no id. Now assigns a fresh UUID like the normal create path.

  5. ODPS merge/replace wiped owners/domains/experts/reviewers/certification on an existing product. findExistingByName loaded it with only id,name,version, so the lazy relationship fields came back null and smartMerge/fullReplace copied those nulls — and since domains is required, a PUT against an existing product either 400'd or stripped governance fields. Now loads with EXPORT_FIELDS so those fields are hydrated before merging.

  6. ODPSImportModal name guard matched the first name: key via regex, which can hit an SLA dimension or tag rather than the product name (product.details.<lang>.name). Now parses the YAML with js-yaml and reads that path.

Tests

New DataProductODPS.spec.ts (8 tests): export, validate (valid + invalid), export→rename→import round-trip, merge preserves domain + owners (regression guard for #5), and the UI export-menu / import-modal / metadata-edit-modal flows.

Verified on a fresh local stack: ODPS 8/8 green. (Excludes the 1.13-only legacy-form changes from #29154, since main uses the HookForm AddDomainForm.)

🤖 Generated with Claude Code


Summary by Gitar

  • DataProduct fixes:
    • Fixed PATCH persistence for dataProductType, visibility, and portfolioPriority in DataProductUpdater.
    • Removed extension.odpsMetadata to resolve 400 errors and ensure compatibility with custom field registration.
    • Added Entity.getEntityReferenceByName to fix 409 domain conflicts and UUID assignment for import-created entities.
    • Hydrated relationships using EXPORT_FIELDS during smartMerge/fullReplace to prevent data loss of governance fields.
    • Replaced regex with js-yaml in ODPSImportModal for accurate product name extraction.
  • Core entity/container improvements:
    • Optimized container cache invalidation and batch-fetched derived tags for bulk indexing operations.
    • Refactored EntityRepository caching to use weighted memory-based limits with CacheConfiguration.
    • Added regression testing for ODPS import name guards in DataProductODPS.spec.ts.

This will update automatically on new commits.

harshach and others added 30 commits May 16, 2026 12:09
* Add connector details to http headers

* fix(ingestion): sanitize User-Agent before sending to OM server

A user-controlled serviceName (or any other value) carrying CR/LF or
other control characters would either crash the request with
requests/httpx InvalidHeader or, worse, smuggle in a second header line.
Strip non-printable ASCII, trim, and cap at 256 chars at every sink
(REST session, SSE stream) and at the source where serviceName is
interpolated. When sanitization leaves nothing usable, drop the header
so the default agent is sent instead of a malformed one.
…els (#28199)

* fix(rdf): scope glossary graph by membership and surface glossary labels

The /rdf/glossary/graph endpoint silently ignored its `glossaryId` filter
and returned every term in every glossary, because the SPARQL query bound
`?glossary` via `OPTIONAL { ?term1 om:belongsTo ?glossary }` while the
JSON-LD context (governance.jsonld) maps GlossaryTerm.glossary to
`om:belongsToGlossary`. The downstream `FILTER(?glossary = <…>)` was a
no-op, and the UI hierarchy view rendered every glossary's terms grouped
by their raw UUID — most visibly for cross-glossary scenarios.
… aliases (#28064)

Cherry-pick of 6c8441a from main, adapted for 1.13:
- Skipped DistributedReindexFinalizer + RatioPromotionPolicy integration in
  DistributedIndexingStrategy (1.13 still uses IndexingStrategy interface +
  recreateIndexHandler.finalizeReindex path; the new finalizer class belongs
  to the post-#27971 distributed-only architecture not yet on 1.13).
- Surgically applied Entity.java additions (EntityIndexCapability registration,
  IndexMappingValidator hook, isTimeSeriesEntity helper) without clobbering
  1.13-only fields (SuggestionRepository, FOLDER/CONTEXT_FILE/ANNOUNCEMENT/
  TASK_FORM_SCHEMA constants) or pulling in DomainSyncHandler from main.
- Surgically applied EntityTimeSeriesRepository.java additions (readTimeSeriesSource
  helper for legacy 'deleted' field scrub) while preserving 1.13's
  shouldSkipSearchResultOnInheritedFieldError defensive logic.

(cherry picked from commit 6c8441a)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
scripts/loadgen.py is a configurable async generator that produces a realistic
asset mix for performance/soak testing on large OpenMetadata instances.

Key design choices vs. the older per-entity ingest_100k_*.py scripts:
  - Bulk endpoints (PUT /{entity}/bulk) for the 8+ types that expose them,
    cutting HTTP calls + ChangeEvents + search bulk requests by ~100x.
  - Inline enrichment (tags, glossary terms) ride in the initial create body
    so we avoid a follow-up PATCH wave — that path saturated the search bulk
    pipeline on the 1.13 stack at 400k tables.
  - Bulk domain assignment via /domains/{name}/assets/add — one batch call
    instead of per-entity PATCHes.
  - Adaptive throttling: sliding-window 5xx/timeout rate triggers backoff,
    preventing the client-retry death spiral when the server is saturated.
  - --total + --mix derive per-entity counts from a single target so a single
    knob scales the whole dataset from 10k → 1M.

Also creates governance objects (domains, data products, glossaries, terms),
data-quality test cases with mixed Success/Failed results, and progresses
incidents through New → Ack → Assigned → Resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New --token / OM_TOKEN flag accepts a JWT directly (bot token from UI,
  long-lived service account, etc.). Written to --token-file before run.
  Login flow (--email/--password) still works as a fallback.
- scripts/LOADGEN.md documents auth modes, common --total / --mix recipes
  (smoke, 100k, 500k, 1M), tuning knobs, what gets created, cleanup
  recipe, and a troubleshooting table for the failure modes that came up
  during the cherry-pick soak test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rip HTML in related-term tooltip (#28203)

* fix(glossary): render every relation type pair on the relations graph and strip HTML from related-term tooltip

The Ontology / Relations Graph deduped edges by (from, to) without
relationType, so multiple relations between the same pair of terms (e.g.
relatedTo + partOf) collapsed to a single edge. Fix the three dedup sites
(buildGraphFromAllTerms, convertRdfGraphToOntologyGraph, mergeEdges,
mergeGraphResults / loadDataModeTerms) so each canonical relation type
becomes its own merged edge; inverse pairs like partOf/hasPart still
combine into one bidirectional edge.

The Related Terms tooltip on a glossary term overview also rendered
entity.description as raw text, so HTML stored by the rich-text editor
(e.g. <p>...</p>) leaked into the tooltip. Strip with
getTextFromHtmlString.

Adds graphBuilders unit tests, exposes mergedEdgesList via
data-edges on the graph container, and a Playwright spec that asserts
both relation types render between the same pair.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): handle asymmetric inverse pairs in mergeEdges

INVERSE_RELATION_PAIRS is not fully bidirectional (e.g. composedOf maps
to partOf but partOf maps to hasPart). The previous refactor used a
canonical key derived from a one-way lookup, which split such pairs into
separate groups and rendered them as two unidirectional edges instead of
one bidirectional one.

Replace the canonical-key grouping with a pair-only grouping and an
isInversePair helper that checks the lookup in both directions, so an
asymmetric mapping still merges correctly. Add unit tests covering
symmetric, inverse, asymmetric-inverse, multi-relation, single-direction,
and same-non-symmetric-type cases.

Addresses Gitar review on #28203.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): honour user-configured relation types in mergeEdges

mergeEdges used a hardcoded INVERSE_RELATION_PAIRS / SYMMETRIC_RELATIONS
table, so user-defined relation types from GlossaryTermRelationSettings
(with their own inverseRelation / isSymmetric flags) never matched as
bidirectional pairs on the graph — both directions rendered as separate
unidirectional edges.

Thread the configured relationTypes list from useOntologyExplorer through
OntologyGraph and useGraphDataBuilder into mergeEdges, which now overlays
the runtime config on top of the hardcoded defaults. inverseRelation
declared in one direction is also recorded in reverse when missing, so
the same edge resolves whichever side fires first.

Add unit tests for:
- runtime inverse pair (both directions configured)
- runtime symmetric relation
- inferred reverse inverse mapping (only forward direction configured)
- unknown custom relation stays unidirectional

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): isolate multi-relation fixtures to a dedicated glossary

Adding partOf between term1 and term2 in beforeAll polluted the shared
fixture for the rest of the OntologyExplorer spec, breaking three
unrelated tests:
  - 'entity panel should display outgoing or incoming relations section'
    and 'Relations tab should show the related term by name' failed
    strict-mode because partOf surfaced BOTH outgoing and incoming
    labels for the same node.
  - 'should show hierarchy empty state when no hierarchical relations'
    failed because partOf is a hierarchical relation type so the
    glossary was no longer empty in Hierarchy view.

Create a separate multiRelGlossary with multiRelTermA / multiRelTermB
and add the relatedTo + partOf pair there. The multi-relation test now
filters by that glossary, leaving term1 / term2 with only the original
relatedTo edge so existing tests continue to assert exactly one
direction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(glossary): drop hardcoded relation maps, trust backend settings

INVERSE_RELATION_PAIRS and SYMMETRIC_RELATIONS in useGraphData.ts were
legacy from before GlossaryTermRelationSettings became configurable. The
backend now seeds the authoritative definitions via the 1.13.0 migration
(each relation type carries inverseRelation, isSymmetric, category,
etc.) and the UI already fetches them in useOntologyExplorer and threads
them through to mergeEdges.

Keeping a hardcoded fallback created the exact drift Gitar flagged:
the hardcoded composedOf -> partOf disagrees with the seeded RDF
ontology (composedOf inverseOf componentOf). Removing it makes the
backend the sole source of truth.

Fail-safe behavior: if getGlossaryTermRelationSettings() fails (the
catch returns empty), edges still render — they just stay
unidirectional instead of merging into bidirectional pairs. Better
than fabricating semantics we cannot verify.

Tests are updated to pass an explicit seededRelationTypes config that
mirrors the migration JSON, plus a new "no settings provided" case
covering the fail-safe path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit a7f487c)
…28145) (#28210)

The /v1/users/online endpoint computed `now - timeWindow*60*1000` for the
SQL threshold. When the UI sent `timeWindow=0` for the "All time" option,
the threshold collapsed to `now`, producing `lastActivityTime > now` —
which matched zero rows. Skip the time predicate entirely when
`timeWindow <= 0` so the filter falls through to "all non-bot users".

Reported via #27993.
…8047) (#28212)

PATCH on a File without columns (PDF/image/etc.) NPEs in
ColumnEntityUpdater.updateColumns because FileRepository.entitySpecificUpdate
unconditionally invokes the columns updater. recordListChange already
null-coalesces internally, but the subsequent `for (Column updated :
updatedColumns)` iteration does not — any updater calling updateColumns
with a null list hits the same path.

Null-coalesce origColumns and updatedColumns at the top of
ColumnEntityUpdater.updateColumns so any optional-columns entity is safe.
Also guard `added.getTags().stream()` with listOrEmpty so a tag-less added
column does not NPE.

Adds test_patchFileWithoutColumns_doesNotNpe in FileResourceIT that
creates a PDF file (no columns), PATCHes the description, and asserts
the response is 200 — fails on this branch without the fix.

Cherry-picked code + regression test from #28047 to 1.13. The Folder
files and BaseEntityIT migration from the original PR do not apply
because the Folder entity (#27558 "Context center") is not in this
release.
(cherry picked from commit 71893d5)
… and ES_INDEX_MAP (#28104)

* fix(ingestion): add Drive entities to ENTITY_REFERENCE_CLASS_MAP

The Automator app resolves resource types from resources.type against
ENTITY_REFERENCE_CLASS_MAP. Drive entities (directory, file, spreadsheet,
worksheet) and the driveService were missing, so selecting them in the
Add Automation UI failed at runtime with:

  AutomatorException: Can't get class from resource type: directory

Register the Drive data entities and driveService so the Automator can
paginate and act on them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Adding missng drive entity to ES MAP

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Khairajani <himanshukhairajani8@gmail.com>
Co-authored-by: Himanshu Khairajani <46777429+Khairajani@users.noreply.github.com>
* make compressed dag ingestion available

* address comments + failing test

* fix test

* update return type from any

* change f strings to %s

* fix statics
…igger (#28109)

* fix(ingestion-pipeline): inherit owners from service and authorize trigger (#27962)

* fix(policy): grant Trigger to default bot + steward policies for /trigger authz

* fix(test): use java.time.Instant instead of Joda-Time per code review

* fix(policy): use a dedicated TriggerRule on DataStewardPolicy instead of mixing into EditRule

* fix(migration): drop generic exception catch + inline rule description per code review

* fix(policy): add trailing newline to DataStewardPolicy.json per code review

(cherry picked from commit 4350922)
…lter plumbing (#28084)

* fix(dq-dashboard): denormalize certification onto testCase index and add filter plumbing

* fix(dq-dashboard): cascade certification to all table-child search indices

(cherry picked from commit 1b35775)
…break a schema (#28060)

* fix(ingestion): isolate per-entity failures so one bad table doesn't break a schema

A Snowflake table whose name cannot be FQN-quoted (e.g. an embedded
newline coming from a backup script that forgot to strip a trailing
"\n") used to break the entire schema's ingestion through several
non-isolated code paths:

- snowflake/utils.py::_get_schema_columns iterated information_schema.columns
  rows; the first bad row caused @reflection.cache to cache the
  exception, so every subsequent get_columns() call in the same schema
  re-raised it. Every valid table in the schema ended up ingested with
  columns=[].
- snowflake/metadata.py::_get_table_names_and_types's deleted-tables FQN
  listcomp aborted on the first bad deleted name and dropped the rest.
- common_db_source.py::get_tables_name_and_type wrapped the entire
  table+view iteration in a single try/except. One fqn.build() failure
  ended the generator, so good tables yielded after the bad one were
  silently skipped.
- topology_runner.py::_process_stage caught only ValueError, so an
  APIError from get_by_name (when the server's quoteName rejected a
  bad FQN) halted the whole stage and skipped every remaining entity
  in the schema.

Each site is now per-entity-isolated: a single bad entry is logged at
WARNING and skipped; the rest of the schema continues. Per-entity
failures stay as warnings -- they do not escalate to status.failed --
so a known-noisy table that the user has already filtered out doesn't
trip WorkflowExecutionError on every run.

Adds focused unit tests for each of the four fault-isolation sites and
flips the existing quote_name newline test to assert the rejection
(matching the OM server's contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ingestion): satisfy ruff format + basedpyright for CI

- Run ruff 0.15.12 format over the two files CI flagged.
- Tighten get_tables_name_and_type return annotation from
  Iterable[Tuple[str, str]] to Iterable[Tuple[str, TableType]] so the
  generator's yielded type matches its declared signature.
- Suppress 3 basedpyright reportAttributeAccessIssue errors in
  snowflake/metadata.py at the deleted-tables FQN block via targeted
  `# pyright: ignore` comments. These attribute accesses
  (`SnowflakeTableList.get_deleted/get_not_deleted` and the
  `database`/`database_service` keys on TopologyContext) are the same
  patterns elsewhere in the file -- they only became "new" errors
  because my refactor shifted their column positions out of the
  baseline-matched range.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* revert(ingestion): keep topology_runner._process_stage exception narrow

Revert the broadening of `except ValueError` to `except Exception` in
TopologyRunnerMixin._process_stage and the accompanying
`_entity_request_label` helper. With the per-connector defenses in
common_db_source.get_tables_name_and_type and snowflake_utils.
get_schema_columns, the bad-name scenarios that prompted this widening
no longer reach the topology runner at all. The cross-cutting change
to _process_stage will be filed as a separate PR with its own
justification so it can be reviewed independently from the connector-
level customer fix.

Also restores test_runner.py to its pre-PR structure; PerEntityIsolationTest
is removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(ingestion): suppress pre-existing basedpyright errors flagged in CI

CI's static-checks job (basedpyright 1.39.3 with --baselinemode=discard)
surfaced 11 pre-existing errors across 7 unrelated files. These are all
baseline-drifted -- same code patterns exist on main but their baseline
entries no longer match (stub or column drift). Adding targeted
`# pyright: ignore[<code>]` comments on each line so this customer
fix PR can land without bundling a full baseline regeneration:

- ometa/utils.py, dashboard/tableau/metadata.py, snowflake/models.py,
  utils/entity_link.py: reportPrivateImportUsage on requests submodule
  imports (`quote`, `urlparse`, `unquote_plus`).
- dashboard/grafana/client.py, database/dbt/dbt_config.py:
  reportOptionalMemberAccess on `err.response.status_code` /
  `exc.response.status_code` inside HTTPError handlers (response is
  None-typed but always non-None inside `requests.exceptions.HTTPError`).
- mcp/client.py: reportArgumentType on `json=` kwarg passed to
  `requests.Session.post`.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te test (#28221)

The 'Create data product via UI with description' test was failing in
CI with a Playwright strict-mode violation:

  expect(locator).toBeVisible() failed
  Locator: getByTestId('alert-icon')
  strict mode violation: getByTestId('alert-icon') resolved to 2 elements:
    1) page-layout-v1 > alert-icon       (AlertBar)
    2) #notistack-snackbar > alert-icon  (NotificationMessage)

The toastNotification call in the spec is redundant: the preceding
'await createRes' on the /api/v1/dataProducts POST already syncs on
the backend create, and the next step asserts the new data product
card is visible by name. Removing the toast assertion drops the
strict-mode collision without weakening test coverage.

Co-authored-by: Siddhant <siddhant@MacBook-Pro-751.local>
(cherry picked from commit 55e9131)
…dashboard (#28085)

* fix(dq-dashboard): add Certification filter dropdown and exclude it from Tag dropdown

* test(dq-dashboard): add Playwright coverage for the Certification filter

* fix(tag-page): route Tier/Certification tag pages through their dedicated DQ filter keys

* test(dq-dashboard): cover TagPage routing for Certification detail page

* chore(playwright): apply Prettier formatting to CertificationFilter spec

* fix(dq-dashboard): make Certification search filter from the full options list

* test(dq-dashboard): cover independent Tier/Certification fetch gating

* test(dq-dashboard): apply review feedback on CertificationFilter spec

(cherry picked from commit 3988587)
* Add multiple related term show badge on edges

* fix lint issue

(cherry picked from commit fa4ef3c)
* chore(ci): Snyk Job Summary + high/critical gate + unified Slack notify

* nit

* address gitar

* beautify message
…ataQuality tests (#28226)

- Use `.first()` with `expect().toBeVisible()` instead of `.waitFor()` on
  multi-match locator `[data-testid="status-data-widget"]` (8 elements)
- Add `page.mouse.move(0, 0)` before sidebar hover in `sidebarClick` to
  dismiss tooltips that intercept pointer events
- Wrap all test bodies in `test.step()` for readability and granular
  failure reporting

Fixes open-metadata/openmetadata-collate#4132

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…6.0 (#28242)

* fix(rdf): reclaim Fuseki disk via compaction + upgrade Jena 4.10 → 5.6.0

PR #28117's SPARQL cleanup converged the logical RDF state but never freed
disk: TDB2 deletes only mark blocks free and the journal grows monotonically
until /$/compact runs. RdfIndexApp.clearRdfData() now calls a new
RdfStorageInterface.compactStorage() between clearAll() and reloadOntologies()
so each recreate run reclaims to a fresh dataset directory. JenaFusekiStorage
posts to /$/compact/{dataset}?deleteOld=true and polls /$/tasks/{id} until
finished, with failures logged and swallowed (disk hygiene, not correctness).

Also unifies the Jena classpath: openmetadata-service was on 4.10.0 and
openmetadata-integration-tests on 5.0.0. Both now pin to 5.6.0 via a single
root pom property, dropping the apache-jena-libs BOM in favour of explicit
jena-core/arq/rdfconnection deps (we're a remote-Fuseki client and never
embed TDB; pulling jena-tdb1/2 triggers a Jena 5/6 static-init regression).
Picks up CVE-2025-49656 and CVE-2025-50151 (admin-side fixes shipped in
Jena 5.5.0). Jena 6.x parked: both 6.0.0 and 6.1.0 hit a recursive clinit
bug where TypeMapper.reset reads RDF.dtLangString before RDF.<clinit>
completes.

Fuseki server bumped 4.10/5.0 → 5.6.0 across all in-repo Dockerfiles; the
unmaintained stain/jena-fuseki:* image references in dev compose files
switched to building from docker/rdf-store/Dockerfile, and Testcontainers
moved to secoresearch/fuseki:5.5.0 (maintained, CVE-fixed; the dataset is
created by JenaFusekiStorage.ensureDatasetExists() so the stain-only
FUSEKI_DATASET_1 env var is no longer needed).
…n to avoid ES mapping depth limit (#28214)

* fix(search): flatten nested schemaFields children to avoid ES mapping depth limit

Search reindexing failed at the Sink stage with "Failed to parse" for
Topics and API Endpoints whose schemas nest records deeper than ~17
levels. The recursive messageSchema/requestSchema/responseSchema
`schemaFields[].children` tree pushed the Elasticsearch object path past
the default `index.mapping.depth.limit` of 20, so the bulk item was
rejected with a mapper_parsing_exception.

Map the recursive `children` field as `flattened` (auto-translated to
`flat_object` on OpenSearch by OsUtils) so the entire child subtree
collapses into a single field, capping object depth at 3 regardless of
how deep the schema nests. Top-level `schemaFields.{name,description,...}`
keep their normal analyzed mapping, so search/sort/aggregations there are
unchanged. zh/topic had no `children` mapping at all, so it is added to
prevent dynamic mapping from re-introducing the depth blow-up.

Drop the `schemaFields.children.keyword` boosted fields from
TopicIndex/APIEndpointIndex getFields() since `flattened` has no
`.keyword` sub-field.

Fixes #4122

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search): flatten nested schemaFields children to avoid ES mapping depth limit

Search reindexing failed at the Sink stage with "Failed to parse" for
Topics and API Endpoints whose schemas nest records deeper than ~17
levels. The recursive messageSchema/requestSchema/responseSchema
`schemaFields[].children` tree pushed the Elasticsearch object path past
the default `index.mapping.depth.limit` of 20, so the bulk item was
rejected with a mapper_parsing_exception.

Map the recursive `children` field as `flattened` (auto-translated to
`flat_object` on OpenSearch by OsUtils) so the entire child subtree
collapses into a single field, capping object depth at 3 regardless of
how deep the schema nests. Top-level `schemaFields.{name,description,...}`
keep their normal analyzed mapping, so search/sort/aggregations there are
unchanged. zh/topic had no `children` mapping at all, so it is added to
prevent dynamic mapping from re-introducing the depth blow-up.

Drop the `schemaFields.children.keyword` boosted fields from
TopicIndex/APIEndpointIndex getFields() since `flattened` has no
`.keyword` sub-field.

Fixes open-metadata/openmetadata-collate#4122

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search): flatten nested children for table, container, dashboardDataModel, worksheet, searchIndex

Extends the schemaFields depth-limit fix to the remaining entities whose
recursive `columns`/`fields` `children` trees blow past Elasticsearch's
default `index.mapping.depth.limit` of 20, failing search indexing with
mapper_parsing_exception ("Failed to parse"):

- table, dashboardDataModel, worksheet  -> columns.children
- container                             -> dataModel.columns.children
- searchIndex                           -> fields.children

Map the recursive `children` field as `flattened` (auto-translated to
`flat_object` on OpenSearch) in all 4 locales. table/dashboardDataModel/
container had no `children` mapping at all, so it is added to stop dynamic
mapping from re-introducing the depth blow-up; worksheet/searchIndex had
it as a one-level object and are converted.

Also drop `.keyword` references to the now-flattened children fields,
which `flattened` does not provide:
- SearchEntityIndex.getFields(): remove `fields.children.name.keyword`
- searchSettings.json: retarget Topic/APIEndpoint/Table exact-match
  boosts and the documented field list off `*.children*.keyword` onto
  the flattened `*.children.name` virtual keys (addresses PR review).

Fixes open-metadata/openmetadata-collate#4122

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ui): update search settings mock for flattened columns.children

The "Restore default search settings" Playwright test deep-equals the
reset `table` config against `mockEntitySearchConfig`. Update the mock's
`columns.children` searchField from `columns.children.name.keyword` to
`columns.children.name` to match searchSettings.json, since the
flattened `children` mapping no longer exposes a `.keyword` sub-field.

Fixes open-metadata/openmetadata-collate#4122

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search): drop flattened children field from container highlightFields

`dataModel.columns.children.name` was listed in the container
`highlightFields`. Now that `children` is mapped as `flattened`
(`flat_object` on OpenSearch), highlighting a flat-object sub-field is
rejected — OpenSearch fails the search with
`search_phase_execution_exception` (HTTP 400), so every container
`/api/v1/search/query` returned 500 and the Explore summary panel could
not open.

Remove the field from container `highlightFields`. Highlighting of
deeply-nested child column names is dropped (accepted flattened
trade-off); top-level column highlighting is unaffected. Container is
the only entity whose `highlightFields` referenced a recursive
`children` path.

Fixes open-metadata/openmetadata-collate#4122

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 2b70fb4)
…mports for 1.13

Renames searchSeparationSuite.ts → SearchSeparationSuite.ts and updates
all spec file imports accordingly. Adds SidebarItem, waitForAllLoadersToDisappear,
searchAndClickOnOption, and sidebarClick imports that were missing in the 1.13
version of the suite, aligning it with the search-separation-test branch fixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ght panel (#28248)

* test(playwright): wait on search API instead of loader in Overview right panel

* add slow in ingestion bot
* fix import issue search separation

* nit

* nit
…ance (#28079)

* feat(ui): lazy load home page widgets to improve initial load performance

Replace static widget imports with React.lazy() in CustomizeMyDataPageClassBase
and wrap widget renders in Suspense with a skeleton fallback in
CustomizableLandingPageUtils. Each widget now loads as a separate JS chunk,
reducing the initial bundle size of the home page.

Closes open-metadata/openmetadata-collate#4075

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove unused files and functions

* minor fix

* minor fix

* revert deleted constants

* address comments

* minor fix

* revert lodash-es to lodash

* fix(playwright): fix strict mode violation and tooltip intercept in DataQuality tests

- Use `.first()` with `expect().toBeVisible()` instead of `.waitFor()` on
  multi-match locator `[data-testid="status-data-widget"]` (8 elements)
- Add `page.mouse.move(0, 0)` before sidebar hover in `sidebarClick` to
  dismiss tooltips that intercept pointer events
- Wrap all test bodies in `test.step()` for readability and granular
  failure reporting

Fixes open-metadata/openmetadata-collate#4132

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix(playwright): fix strict mode violation and tooltip intercept in DataQuality tests"

This reverts commit 644900b.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…28236)

* Fixes #28229: cascade Table certification PATCH to child search docs

When a Table's certification is added, changed, or removed via PATCH the
existing cascadeCertificationToChildren path never fired because
SearchRepository.requiresPropagation returned false on a cert-only
ChangeDescription. TableRepository did not list certification in its
propagation descriptors, so the gate stayed closed and the DQ
dashboard's Certification filter kept returning the stale cert on
test_case / test_case_result / test_case_resolution_status / test_suite /
column docs until a full reindex.

Add an EXTERNAL_HANDLER PropagationType for fields whose cascade is
driven by a dedicated SearchRepository handler instead of the generic
descriptor-driven script (cert needs full-object replace on add/update
and explicit removal on delete, which RAW_REPLACE can't express because
it restores the old value on delete). Register certification with this
type on TableRepository so the gate opens and the existing
cascadeCertificationToChildren method runs. Add no-op cases in the
three appendAdd/Update/DeleteScript switches so the new enum value
doesn't accidentally trigger generic auto-propagation.

(cherry picked from commit d6cec7e)
…n reindex jobs as failed (#28227)

* fix(search-index): stop marking clean reindex jobs as failed

A distributed search reindex was being stored with status `failed` and an
empty `failureContext` even though `failedRecords` was 0 across every stage.

Root cause: two independent entity-count implementations disagree.
`DistributedIndexingStrategy.getEntityTotal` (ListFilter(null)) seeds
`entityStats.totalRecords`, while `PartitionCalculator.getEntityCount`
(ListFilter(Include.ALL)) sizes the actual partitions. For a churny
time-series entity (e.g. testCaseResolutionStatus) the two drift — the
pre-count saw 11, the partition plan covered 9. All 9 were indexed cleanly,
but `StatsReconciler` kept the stale pre-count as the job total, producing a
phantom `total > success` gap. `hasIncompleteProcessing` escalated that gap
to `COMPLETED_WITH_ERRORS` -> `ACTIVE_ERROR`, which `OmAppJobListener`
collapsed to `FAILED`.

Changes:
- hasIncompleteProcessing now treats only `failedRecords > 0` as an error;
  a total/success gap is never a failure.
- updateEntityStats sets per-entity totalRecords from the partition plan,
  the authoritative "what we process", so total and success agree.
- getEntityTotal's time-series path uses ListFilter(Include.ALL) to match
  PartitionCalculator at the source.
- Thread warningRecords end-to-end (SearchIndexJob.EntityTypeStats, the
  coordinator, the stats mapper/aggregator, StatsReconciler) so warnings are
  counted instead of silently dropped — and never counted as failures.
- Record stale-relationship orphans (READER_RELATIONSHIP_WARNING) in the
  search_index_failures table for operator visibility; countFailuresByJobId
  excludes them so failureRecordCount stays a failure count.

Fixes open-metadata/openmetadata-collate#4099

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search-index): include warnings in processedRecords and reconciler total

Addresses PR review: making totalRecords = success + failed + warnings left
two counters out of sync.

- toEntityTypeStats / getJobWithAggregatedStats now include warnings in
  processedRecords, so getProgressPercent() reaches 100% when a job finishes
  with warnings instead of appearing stuck below 100%.
- StatsReconciler.reconcile computedTotal now includes readerWarnings, so the
  "Stats discrepancy detected" warning no longer fires when the gap is fully
  explained by stale-relationship warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Jun 18, 2026
Comment on lines +168 to +169
if (details.getTags() != null && !details.getTags().isEmpty()) {
dp.setTags(wrapTagsAsLabels(details.getTags()));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Edge Case: ODPS tag import fails for tags not already in OpenMetadata

ODPSConverter.fromODPS maps every free-form ODPS tag string into a TagLabel with tagFQN=, source=CLASSIFICATION (wrapTagsAsLabels). These labels flow into create()/createOrUpdate(). During entity preparation, TagLabelUtil.applyTagCommonFields calls getTag(tagFQN) → Entity.getEntityByName(Entity.TAG, ...), which throws EntityNotFoundException when the tagFQN does not correspond to an existing classification tag. ODPS documents authored on a foreign system will routinely carry tags that don't exist in this OpenMetadata instance, so any such import will fail (404/400) rather than importing the product without the unknown tags. Consider resolving tags leniently (e.g. drop unknown tags, or only attach those that resolve via applyTagCommonFieldsGracefully) so a stray tag string doesn't abort the whole import.

Only import tags that resolve to existing OM tags, dropping unknown ones.:

// in fromODPS, replace dp.setTags(wrapTagsAsLabels(...)) with a filtered list
List<TagLabel> labels = wrapTagsAsLabels(details.getTags());
labels.removeIf(l -> {
  try { TagLabelUtil.applyTagCommonFields(l); return false; }
  catch (Exception e) { LOG.warn("Skipping unknown ODPS tag {}", l.getTagFQN()); return true; }
});
if (!labels.isEmpty()) dp.setTags(labels);
  • Apply fix

Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎

Comment on lines +1845 to +1859
private Response mergeOrCreateDataProductFromODPS(
UriInfo uriInfo,
SecurityContext securityContext,
ODPSDataProduct odps,
String languageCode,
String domainFqn,
String strategy) {
DataProduct imported = buildDataProductFromODPS(odps, languageCode, domainFqn);
DataProduct existing = findExistingByName(imported.getName());
DataProduct finalProduct;
if (existing == null) {
finalProduct = imported;
} else if ("replace".equalsIgnoreCase(strategy)) {
finalProduct = ODPSConverter.fullReplace(existing, imported);
} else {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Bug: ODPS merge/replace silently ignores ?domain= parameter

In mergeOrCreateDataProductFromODPS the imported product is built with the domain resolved from ?domain= (buildDataProductFromODPS), but both smartMerge and fullReplace copy domains from the existing product (merged.setDomains(existing.getDomains())). As a result, when a caller does PUT /odps?domain=&strategy=replace against an existing product to move it to a different domain, the supplied domain is silently discarded. If preserving the existing domain is intentional (governance), that is fine, but the endpoint accepts and documents the domain query param, so the no-op may surprise callers. Consider documenting that domain changes are not applied on merge/replace, or honoring an explicit domain override.

Was this helpful? React with 👍 / 👎

@github-actions

Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

The previous import-modal test only covered a round-trip YAML whose name
matches the product, so it passed with either the old first-`name:` regex or
the new js-yaml parser. Add a test that pastes a YAML with a decoy top-level
`name:` matching the current product but a different real product name at
product.details.en.name — the hardened guard must read that path and block the
import (no PUT fires, modal stays open). Fails with the old regex, passes now.

9/9 ODPS tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 45c16ee)
@greptile-apps

greptile-apps Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Too many files changed for review. (2462 files found, 100 file limit)

Comment on lines 279 to +427
exit 1

notify:
runs-on: ubuntu-latest
environment: security-scan
needs: [vulnerability-scan, security-scan]
if: always()
steps:
- name: Download Snyk artifact
if: needs.security-scan.result != 'skipped'
uses: actions/download-artifact@v4
with:
name: security-report
path: security-report
continue-on-error: true

- name: Download Retire.js Slack artifact
if: needs.vulnerability-scan.result != 'skipped'
uses: actions/download-artifact@v4
with:
name: retire-slack
path: retire-report
continue-on-error: true

- name: Build Slack header payload
id: build-header
env:
RETIRE_RESULT: ${{ needs.vulnerability-scan.result }}
SNYK_RESULT: ${{ needs.security-scan.result }}
REF_NAME: ${{ github.ref_name }}
RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
python3 - <<'PY' > header.json
import json, os, pathlib

retire = os.environ["RETIRE_RESULT"]
snyk = os.environ["SNYK_RESULT"]
ref_name = os.environ["REF_NAME"]
run_url = os.environ["RUN_URL"]

def status_icon(s):
return {"success": "✅", "cancelled": "⚠️ (cancelled)", "skipped": "⚠️ (skipped)"}.get(s, "❌")

def load_counts(path):
p = pathlib.Path(path)
if not p.exists():
return None
try:
return json.loads(p.read_text())
except Exception:
return None

def totals_line(counts):
if not counts:
return ""
return (
f"\n🚨 {counts.get('critical', 0)} critical · "
f"🔴 {counts.get('high', 0)} high · "
f"🟠 {counts.get('medium', 0)} medium · "
f"🟡 {counts.get('low', 0)} low"
)

retire_counts = load_counts("retire-report/_retire_counts.json")
snyk_counts = load_counts("security-report/_counts.json")

if retire == "success" and snyk == "success":
overall = "🟢"
elif retire == "failure" or snyk == "failure":
overall = "🚨"
else:
overall = "⚠️"

header = (
f"{overall} *Security scan* — *OpenMetadata Repo*\n"
f"_branch_ `{ref_name}` · <{run_url}|Open run details>\n"
f"• Vulnerability scan (Retire.js): {status_icon(retire)}"
f"{totals_line(retire_counts)}\n"
f"• Security scan (Snyk): {status_icon(snyk)}"
f"{totals_line(snyk_counts)}"
)
fallback = f"{overall} Security scan — OpenMetadata Repo on {ref_name}. {run_url}"
payload = {
"text": fallback,
"blocks": [{"type": "section", "text": {"type": "mrkdwn", "text": header}}],
}
print(json.dumps(payload))
PY

- name: Send Slack — header
id: send-header
uses: slackapi/slack-github-action@v1.27.1
with:
channel-id: ${{ secrets.SLACK_CHANNEL_IDS }}
payload-file-path: header.json
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

- name: Build Slack thread payload
id: build-thread
if: always() && steps.send-header.outputs.ts != ''
env:
THREAD_TS: ${{ steps.send-header.outputs.ts }}
run: |
python3 - <<'PY' > thread.json
import json, os, pathlib

thread_ts = os.environ["THREAD_TS"]

def read_sections(path):
p = pathlib.Path(path)
if not p.exists():
return []
text = p.read_text().strip()
return [s.strip() for s in text.split("\n\n") if s.strip()]

retire_sections = read_sections("retire-report/_retire_slack.txt")
snyk_sections = read_sections("security-report/_slack.txt")

blocks = []
MAX_TEXT = 2850

def push(section):
text = section if len(section) <= MAX_TEXT else section[:MAX_TEXT].rstrip() + "\n_…truncated, see Job Summary_"
blocks.append({"type": "section", "text": {"type": "mrkdwn", "text": text}})

for section in retire_sections[:24]:
push(section)
if retire_sections and snyk_sections:
blocks.append({"type": "divider"})
for section in snyk_sections[:23]:
push(section)

if not blocks:
push("_No detailed scan output was produced._")

payload = {
"thread_ts": thread_ts,
"text": "Security scan details (thread reply)",
"blocks": blocks,
}
print(json.dumps(payload))
PY

- name: Send Slack — thread reply
if: always() && steps.send-header.outputs.ts != '' && steps.build-thread.outcome == 'success'
uses: slackapi/slack-github-action@v1.27.1
with:
channel-id: ${{ secrets.SLACK_CHANNEL_IDS }}
payload-file-path: thread.json
# jsonify pins Content-Type to application/json so callers that
# interpolate user-controlled values into the payload cannot have
# their response rendered as HTML.
return make_response(jsonify(response_obj), status)
Comment on lines +375 to +389
let putFired = false;
const listener = (r: import('@playwright/test').Response) => {
if (
r.url().includes('/dataProducts/odps/yaml') &&
r.request().method() === 'PUT'
) {
putFired = true;
}
};
page.on('response', listener);
await page.getByTestId('odps-import-submit').click();

// The guard short-circuits before any API call: no PUT, modal stays open.
await expect(async () => {
expect(putFired).toBe(false);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Bug: Name-guard test passes immediately, never verifies PUT is blocked

The regression test intends to prove the hardened name guard short-circuits before any API call. It does this with:

let putFired = false;
...
await page.getByTestId('odps-import-submit').click();
await expect(async () => {
  expect(putFired).toBe(false);
}).toPass({ timeout: 3000, intervals: [300] });

expect(...).toPass() resolves as soon as the inner assertion succeeds. Since putFired is false at the moment of the first poll (immediately after the click), the assertion passes instantly on the very first iteration — before the browser has any chance to issue the PUT. The timeout: 3000 / intervals: [300] therefore do nothing: the test would also pass even if the guard were broken and a PUT fired ~100ms later. The test gives false confidence and is not a real regression guard for bug #6.

To actually verify no PUT occurs, wait a fixed settle period (or race a waitForResponse against a timeout) and only then assert. For example:

await page.getByTestId('odps-import-submit').click();
// give the app time to (incorrectly) fire a PUT
await page.waitForTimeout(2000);
expect(putFired).toBe(false);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();

Alternatively assert positively that the guard surfaced a validation error in the modal, which is a stronger signal than the absence of a network call.

Wait a fixed settle period before asserting so a wrongly-fired PUT is actually observed.:

page.on('response', listener);
await page.getByTestId('odps-import-submit').click();

// Give the app time to (incorrectly) fire a PUT, then assert none did.
await page.waitForTimeout(2000);
expect(putFired).toBe(false);
page.off('response', listener);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();
  • Apply fix

Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎

@gitar-bot

gitar-bot Bot commented Jun 18, 2026

Copy link
Copy Markdown
Code Review ⚠️ Changes requested 0 resolved / 3 findings

Addresses several ODPS import and data-product metadata bugs, but changes are requested due to persistent issues with tag imports, domain parameter handling during merge, and incomplete regression test validation.

⚠️ Edge Case: ODPS tag import fails for tags not already in OpenMetadata

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:168-169 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:210-221 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1650-1658 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1845-1859

ODPSConverter.fromODPS maps every free-form ODPS tag string into a TagLabel with tagFQN=, source=CLASSIFICATION (wrapTagsAsLabels). These labels flow into create()/createOrUpdate(). During entity preparation, TagLabelUtil.applyTagCommonFields calls getTag(tagFQN) → Entity.getEntityByName(Entity.TAG, ...), which throws EntityNotFoundException when the tagFQN does not correspond to an existing classification tag. ODPS documents authored on a foreign system will routinely carry tags that don't exist in this OpenMetadata instance, so any such import will fail (404/400) rather than importing the product without the unknown tags. Consider resolving tags leniently (e.g. drop unknown tags, or only attach those that resolve via applyTagCommonFieldsGracefully) so a stray tag string doesn't abort the whole import.

Only import tags that resolve to existing OM tags, dropping unknown ones.
// in fromODPS, replace dp.setTags(wrapTagsAsLabels(...)) with a filtered list
List<TagLabel> labels = wrapTagsAsLabels(details.getTags());
labels.removeIf(l -> {
  try { TagLabelUtil.applyTagCommonFields(l); return false; }
  catch (Exception e) { LOG.warn("Skipping unknown ODPS tag {}", l.getTagFQN()); return true; }
});
if (!labels.isEmpty()) dp.setTags(labels);
⚠️ Bug: Name-guard test passes immediately, never verifies PUT is blocked

📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/DataProductODPS.spec.ts:375-389

The regression test intends to prove the hardened name guard short-circuits before any API call. It does this with:

let putFired = false;
...
await page.getByTestId('odps-import-submit').click();
await expect(async () => {
  expect(putFired).toBe(false);
}).toPass({ timeout: 3000, intervals: [300] });

expect(...).toPass() resolves as soon as the inner assertion succeeds. Since putFired is false at the moment of the first poll (immediately after the click), the assertion passes instantly on the very first iteration — before the browser has any chance to issue the PUT. The timeout: 3000 / intervals: [300] therefore do nothing: the test would also pass even if the guard were broken and a PUT fired ~100ms later. The test gives false confidence and is not a real regression guard for bug #6.

To actually verify no PUT occurs, wait a fixed settle period (or race a waitForResponse against a timeout) and only then assert. For example:

await page.getByTestId('odps-import-submit').click();
// give the app time to (incorrectly) fire a PUT
await page.waitForTimeout(2000);
expect(putFired).toBe(false);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();

Alternatively assert positively that the guard surfaced a validation error in the modal, which is a stronger signal than the absence of a network call.

Wait a fixed settle period before asserting so a wrongly-fired PUT is actually observed.
page.on('response', listener);
await page.getByTestId('odps-import-submit').click();

// Give the app time to (incorrectly) fire a PUT, then assert none did.
await page.waitForTimeout(2000);
expect(putFired).toBe(false);
page.off('response', listener);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();
💡 Bug: ODPS merge/replace silently ignores ?domain= parameter

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1845-1859 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:297 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:330

In mergeOrCreateDataProductFromODPS the imported product is built with the domain resolved from ?domain= (buildDataProductFromODPS), but both smartMerge and fullReplace copy domains from the existing product (merged.setDomains(existing.getDomains())). As a result, when a caller does PUT /odps?domain=&strategy=replace against an existing product to move it to a different domain, the supplied domain is silently discarded. If preserving the existing domain is intentional (governance), that is fine, but the endpoint accepts and documents the domain query param, so the no-op may surprise callers. Consider documenting that domain changes are not applied on merge/replace, or honoring an explicit domain override.

🤖 Prompt for agents
Code Review: Addresses several ODPS import and data-product metadata bugs, but changes are requested due to persistent issues with tag imports, domain parameter handling during merge, and incomplete regression test validation.

1. ⚠️ Edge Case: ODPS tag import fails for tags not already in OpenMetadata
   Files: openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:168-169, openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:210-221, openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1650-1658, openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1845-1859

   ODPSConverter.fromODPS maps every free-form ODPS tag string into a TagLabel with tagFQN=<raw string>, source=CLASSIFICATION (wrapTagsAsLabels). These labels flow into create()/createOrUpdate(). During entity preparation, TagLabelUtil.applyTagCommonFields calls getTag(tagFQN) → Entity.getEntityByName(Entity.TAG, ...), which throws EntityNotFoundException when the tagFQN does not correspond to an existing classification tag. ODPS documents authored on a foreign system will routinely carry tags that don't exist in this OpenMetadata instance, so any such import will fail (404/400) rather than importing the product without the unknown tags. Consider resolving tags leniently (e.g. drop unknown tags, or only attach those that resolve via applyTagCommonFieldsGracefully) so a stray tag string doesn't abort the whole import.

   Fix (Only import tags that resolve to existing OM tags, dropping unknown ones.):
   // in fromODPS, replace dp.setTags(wrapTagsAsLabels(...)) with a filtered list
   List<TagLabel> labels = wrapTagsAsLabels(details.getTags());
   labels.removeIf(l -> {
     try { TagLabelUtil.applyTagCommonFields(l); return false; }
     catch (Exception e) { LOG.warn("Skipping unknown ODPS tag {}", l.getTagFQN()); return true; }
   });
   if (!labels.isEmpty()) dp.setTags(labels);

2. 💡 Bug: ODPS merge/replace silently ignores ?domain= parameter
   Files: openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1845-1859, openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:297, openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:330

   In mergeOrCreateDataProductFromODPS the imported product is built with the domain resolved from ?domain= (buildDataProductFromODPS), but both smartMerge and fullReplace copy domains from the existing product (merged.setDomains(existing.getDomains())). As a result, when a caller does PUT /odps?domain=<new>&strategy=replace against an existing product to move it to a different domain, the supplied domain is silently discarded. If preserving the existing domain is intentional (governance), that is fine, but the endpoint accepts and documents the domain query param, so the no-op may surprise callers. Consider documenting that domain changes are not applied on merge/replace, or honoring an explicit domain override.

3. ⚠️ Bug: Name-guard test passes immediately, never verifies PUT is blocked
   Files: openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/DataProductODPS.spec.ts:375-389

   The regression test intends to prove the hardened name guard short-circuits before any API call. It does this with:
   
   ```
   let putFired = false;
   ...
   await page.getByTestId('odps-import-submit').click();
   await expect(async () => {
     expect(putFired).toBe(false);
   }).toPass({ timeout: 3000, intervals: [300] });
   ```
   
   `expect(...).toPass()` resolves as soon as the inner assertion succeeds. Since `putFired` is `false` at the moment of the first poll (immediately after the click), the assertion passes instantly on the very first iteration — before the browser has any chance to issue the PUT. The `timeout: 3000` / `intervals: [300]` therefore do nothing: the test would also pass even if the guard were broken and a PUT fired ~100ms later. The test gives false confidence and is not a real regression guard for bug #6.
   
   To actually verify no PUT occurs, wait a fixed settle period (or race a `waitForResponse` against a timeout) and only then assert. For example:
   
   ```
   await page.getByTestId('odps-import-submit').click();
   // give the app time to (incorrectly) fire a PUT
   await page.waitForTimeout(2000);
   expect(putFired).toBe(false);
   await expect(page.getByTestId('odps-import-modal')).toBeVisible();
   ```
   
   Alternatively assert positively that the guard surfaced a validation error in the modal, which is a stronger signal than the absence of a network call.

   Fix (Wait a fixed settle period before asserting so a wrongly-fired PUT is actually observed.):
   page.on('response', listener);
   await page.getByTestId('odps-import-submit').click();
   
   // Give the app time to (incorrectly) fire a PUT, then assert none did.
   await page.waitForTimeout(2000);
   expect(putFired).toBe(false);
   page.off('response', listener);
   await expect(page.getByTestId('odps-import-modal')).toBeVisible();

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@github-actions

Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.