Skip to content

Add ODPS Support for Data Products + Custom Intake Forms (1.13)#29154

Merged
harshach merged 5 commits into
1.13from
odps-data-products-intake-1.13
Jun 18, 2026
Merged

Add ODPS Support for Data Products + Custom Intake Forms (1.13)#29154
harshach merged 5 commits into
1.13from
odps-data-products-intake-1.13

Conversation

@harshach

@harshach harshach commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Cherry-picks the ODPS / Data Products + Custom Intake Forms feature (#27600) from main onto 1.13, plus fixes for several bugs the feature shipped with and a large E2E coverage expansion.

What's included

Feature port (cherry-pick of ea438e03b6)IntakeForm entity/table/repository/resource, ODPSConverter, DataProduct ODPS fields, generated schemas/types, IntakeForms designer page, ODPSImportModal, DataProductMetadataModal, ODPS data-product API.

1.13 adaptation — the original commit was built on three main-only features absent from 1.13 (the HookForm AddDomainForm migration #26951, the Request-Data-Access drawer, and task-redesign activity counts). Resolved by keeping 1.13's architecture and layering only the ODPS/intake additions:

  • AddDomainForm stays on 1.13's legacy Ant Design form; the data-product fields, intake-form required rules, and dynamic required custom-property (incl. entity-reference) fields were re-implemented on the antd FieldProp/getField pattern.
  • DataProductsDetailsPage keeps 1.13's version + the ODPS import/metadata modals; dropped the main-only DAR drawer and task-count utilities.
  • Settings: only the Intake Forms menu/route (Workflow Definitions / Task Forms don't exist in 1.13).

Bug fixes (also present on main — see companion PR)

  1. DataProduct PATCH never persisted dataProductType/visibility/portfolioPriority (consolidation reverted unrecorded fields) — added recordChange in the updater.
  2. ODPS import → 400 Unknown custom field odpsMetadata — the converter wrote a non-registered extension key; removed it (export never reads it back).
  3. ODPS import → 409 — the domain reference built from ?domain= had no id; now resolved.
  4. ODPS create-from-import → 409 — the converted entity had no id; now assigns a UUID like the normal create path.

Test coverage

  • New DataProductODPS.spec.ts (8 tests): export, validate (valid + invalid), export→rename→import round-trip, merge strategy, and the UI export-menu / import-modal / metadata-edit-modal flows.
  • IntakeForm.spec.ts: realigned to 1.13's legacy form, plus new Domain enforcement coverage (designer create, option-disabled, required-native-field block, required custom property render+block, entity-reference picker) mirroring the existing DataProduct tests.

Verified on a fresh local stack: ODPS 8/8, IntakeForm 15/15, DataProducts CRUD spot-check green; openmetadata-service compiles + ODPSConverterTest passes.

🤖 Generated with Claude Code

Greptile Summary

This PR cherry-picks the ODPS / Data Products + Custom Intake Forms feature from main to the 1.13 branch, adapting it to 1.13's Ant Design form architecture, and includes four targeted bug fixes (DataProduct PATCH not persisting ODPS scalar fields, 400 on ODPS import from unregistered extension key, 409 from domain reference without ID, and 409 from missing entity ID on create-from-import).

  • New backend: IntakeFormRepository/Resource/Mapper/Validator, ODPSConverter (ODPS v4.1 ↔ OM DataProduct), 6 ODPS import/export/validate endpoints on DataProductResource, custom entity-reference validation in EntityUtil, and DB migrations for intake_form_entity.
  • New frontend: IntakeFormsPage designer, ODPSImportModal (with proper YAML-parsed name extraction replacing the prior regex), DataProductMetadataModal, and intake-form required-field enforcement wired into AddDomainForm.
  • Bug fixes: DataProductUpdater.entitySpecificUpdate now calls recordChange for dataProductType/visibility/portfolioPriority; EXPORT_FIELDS includes extension so merge/replace preserve custom properties; domain reference is resolved to a full entity reference; imported entities receive a UUID before creation.

Confidence Score: 5/5

Safe to merge; the four stated bug fixes are correct and the new ODPS/IntakeForm code paths are well-structured with appropriate fail-closed validation.

The extension-null fix (EXPORT_FIELDS now includes extension), the domain-reference resolution, the UUID assignment on import-create, and the recordChange additions in DataProductUpdater all look correct. The IntakeFormValidator's fail-closed design and the js-yaml-based name extraction in ODPSImportModal address prior fragility. Fresh findings are confined to a tag-replacement semantic in smartMerge and a possible rename block in ensureUniquePerEntityType — neither affects correctness of the core create/update/import flows.

ODPSConverter.smartMerge (tag replacement semantics) and IntakeFormRepository.ensureUniquePerEntityType (rename guard logic) are worth a second look before this ships.

Important Files Changed

Filename Overview
openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java New utility converting between OM DataProduct and ODPS v4.1; enum mappings are complete, sanitizeEntityName is robust, smartMerge/fullReplace both preserve extension/governance fields — tag replacement semantics in smartMerge are the only notable concern.
openmetadata-service/src/main/java/org/openmetadata/service/util/IntakeFormValidator.java New governance validator; correct fail-closed design for transient repo errors; handles native and extension fields; hasMeaningfulValue correctly distinguishes empty text/arrays from 0/false.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/IntakeFormRepository.java New repository; findEnabledForEntityType correctly propagates deserialization failures; ensureUniquePerEntityType may block valid admin renames.
openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java Adds 6 ODPS endpoints; EXPORT_FIELDS includes extension; validate endpoint missing SecurityContext (prior thread); findExistingByName correctly uses EXPORT_FIELDS.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DataProductRepository.java Adds IntakeFormValidator call and recordChange for ODPS scalar fields — fixes PATCH persistence bug; changes are minimal and targeted.
openmetadata-service/src/main/java/org/openmetadata/service/util/EntityUtil.java Adds JsonNode nullOrEmpty overload and entity-reference validation helpers; both ID and FQN lookup paths covered; UUID format validated before lookup.
bootstrap/sql/migrations/native/1.13.1/mysql/schemaChanges.sql Adds intake_form_entity with GENERATED columns and UNIQUE on fqnHash/entityType; idiomatic with existing migration patterns.
bootstrap/sql/migrations/native/1.13.1/postgres/schemaChanges.sql Postgres counterpart; JSONB, ->> operator, and STORED generated columns are all correct.
openmetadata-ui/src/main/resources/ui/src/components/DataProducts/ODPSImportModal/ODPSImportModal.component.tsx Uses js-yaml for name extraction; name-mismatch guard accepts displayName or internal name; merge/replace strategy selection is clean.
openmetadata-ui/src/main/resources/ui/src/rest/intakeFormsAPI.ts getIntakeFormByEntityType returns null on 404 and only re-throws other errors — prevents spurious error toasts when no form is configured.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant UI as ODPSImportModal (UI)
    participant API as DataProductResource
    participant Conv as ODPSConverter
    participant Repo as DataProductRepository
    participant IFV as IntakeFormValidator
    participant IFRepo as IntakeFormRepository

    UI->>API: "POST /odps/yaml?domain=... (YAML)"
    API->>Conv: fromODPS(odps, lang)
    Conv-->>API: DataProduct (bare, name sanitized)
    API->>API: setId(UUID.randomUUID())
    API->>API: resolveEntityReferenceByName(domain)
    API->>Repo: create(dataProduct)
    Repo->>IFV: validate(entity, dataProduct)
    IFV->>IFRepo: findEnabledForEntityType(dataProduct)
    IFRepo-->>IFV: IntakeForm or null
    alt IntakeForm configured and enabled
        IFV->>IFV: checkRequiredFields(entity, form)
        alt Missing required fields
            IFV-->>Repo: IllegalArgumentException 400
        end
    end
    Repo-->>API: DataProduct created
    API-->>UI: 200 DataProduct

    Note over UI,API: PUT /odps/yaml merge or replace
    UI->>API: "PUT /odps/yaml?strategy=merge (YAML)"
    API->>Conv: fromODPS to buildDataProductFromODPS
    API->>Repo: getByName(sanitizedName, EXPORT_FIELDS)
    alt Existing product found
        API->>Conv: smartMerge(existing, imported)
        Conv-->>API: merged id existing extension existing
        API->>Repo: createOrUpdate(merged)
    else New product
        API->>Repo: create(imported)
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant UI as ODPSImportModal (UI)
    participant API as DataProductResource
    participant Conv as ODPSConverter
    participant Repo as DataProductRepository
    participant IFV as IntakeFormValidator
    participant IFRepo as IntakeFormRepository

    UI->>API: "POST /odps/yaml?domain=... (YAML)"
    API->>Conv: fromODPS(odps, lang)
    Conv-->>API: DataProduct (bare, name sanitized)
    API->>API: setId(UUID.randomUUID())
    API->>API: resolveEntityReferenceByName(domain)
    API->>Repo: create(dataProduct)
    Repo->>IFV: validate(entity, dataProduct)
    IFV->>IFRepo: findEnabledForEntityType(dataProduct)
    IFRepo-->>IFV: IntakeForm or null
    alt IntakeForm configured and enabled
        IFV->>IFV: checkRequiredFields(entity, form)
        alt Missing required fields
            IFV-->>Repo: IllegalArgumentException 400
        end
    end
    Repo-->>API: DataProduct created
    API-->>UI: 200 DataProduct

    Note over UI,API: PUT /odps/yaml merge or replace
    UI->>API: "PUT /odps/yaml?strategy=merge (YAML)"
    API->>Conv: fromODPS to buildDataProductFromODPS
    API->>Repo: getByName(sanitizedName, EXPORT_FIELDS)
    alt Existing product found
        API->>Conv: smartMerge(existing, imported)
        Conv-->>API: merged id existing extension existing
        API->>Repo: createOrUpdate(merged)
    else New product
        API->>Repo: create(imported)
    end
Loading

Reviews (3): Last reviewed commit: "test(odps): add regression test for the ..." | Re-trigger Greptile

harshach and others added 3 commits June 17, 2026 16:13
Cherry-picked from ea438e0 (main).

The backend (IntakeForm entity/table/repository/resource, ODPSConverter,
DataProduct ODPS fields), generated schemas/types, IntakeForms designer page,
ODPSImportModal, DataProductMetadataModal, and the ODPS data-product API
applied as authored.

1.13 adaptation — the original commit was built on three main-only features
absent from 1.13; those entanglements were resolved by keeping 1.13's
architecture and layering only the ODPS/intake-form additions:

- AddDomainForm: 1.13 still uses the legacy Ant Design form (main migrated it
  to react-hook-form in #26951). The data-product fields (dataProductType,
  visibility, portfolioPriority, reviewers), intake-form required-field rules,
  and dynamic required custom-property (extension) fields were re-implemented
  on the antd FieldProp/getField pattern instead of HookForm.
- DataProductsDetailsPage: kept 1.13's version and added only the ODPS import
  and metadata-edit modals; dropped the main-only Request-Data-Access drawer
  and activity/task-count utilities it referenced.
- Settings: added only the Intake Forms menu entry/route; the Workflow
  Definitions and Task Forms entries/routes (and their components) do not
  exist in 1.13 and were excluded.
- CollectionDAO/DomainRepository: dropped main-only Announcement/TaskFormSchema
  imports and the dry-run-impact helpers (BulkResponse.withHasSideEffects /
  filterDataProductsByDomain) that bled into the conflict; only IntakeFormDAO
  was added.

Verified: openmetadata-service compiles and test-compiles; AddDomainForm unit
tests pass (13); changed UI files pass tsc/eslint/prettier; i18n locales synced.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…llow-up)

Adds a Playwright suite for the ODPS + data-product-metadata flow (which had
zero E2E coverage) and fixes four backend bugs it surfaced. All bugs are
present on main too — the feature code (ODPSConverter, DataProductRepository,
DataProductResource import handlers) is unchanged since #27600 and none were
fixed by a later commit, so these fixes should be forward-ported.

Bugs fixed:

1. DataProduct PATCH never persisted dataProductType/visibility/portfolioPriority.
   DataProductUpdater.entitySpecificUpdate() recorded changes only for name and
   domains, so change-consolidation reverted the ODPS-aligned scalar fields on
   store: the PATCH response showed the value but a subsequent GET returned null.
   The metadata-edit modal silently no-opped. Added recordChange() for the three
   fields (reviewers were already handled by updateReviewers()).

2. ODPS import always returned 400 "Unknown custom field odpsMetadata".
   ODPSConverter.fromODPS wrote the raw document into extension.odpsMetadata, but
   that key is not a registered custom property, so validateExtension rejected
   every import. toODPS never reads it back (export is driven by native fields),
   so the write was dead weight — removed it; merge/replace now preserve the
   existing product's real custom properties.

3. ODPS import returned 409 because the domain reference built from the ?domain=
   FQN had no id (the create authorization context and relationship storage need
   it). buildDataProductFromODPS now resolves the domain via
   Entity.getEntityReferenceByName.

4. ODPS create-from-import returned 409 (duplicate key) because the converter
   built a bare entity with no id, so the generated primary-key column was null.
   The normal create path sets the id via EntityMapper.copy; buildDataProductFromODPS
   now assigns a fresh UUID (merge/replace overwrite it with the existing id).

UI: completed the AddDomainForm adaptation by rendering required
entity-reference custom properties as a user/team picker (was a text fallback);
single entityReference values are unwrapped from the picker's array shape so the
API receives a bare object.

Tests:
- New playwright/e2e/Pages/DataProductODPS.spec.ts (8 tests): ODPS export,
  validate (valid + invalid), export→rename→import round-trip, merge strategy,
  and the UI export-menu / import-modal / metadata-edit-modal flows.
- IntakeForm.spec.ts: realigned AddDomainForm assertions to 1.13's legacy antd
  form (add-domain testid, MUI required-asterisk label) so the cherry-picked
  intake-form enforcement tests pass against the adapted form.

Verified on a fresh local stack: ODPS 8/8, IntakeForm 10/10, DataProducts CRUD
spot-check green; openmetadata-service compiles + ODPSConverterTest passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Intake forms support Domain, DataProduct and GlossaryTerm (all enforced by
IntakeFormValidator in their repositories), but E2E only covered DataProduct.
Mirror that coverage for Domain (AddDomainForm rendered with type=DOMAIN):

- Designer creates a Domain intake form via the UI (entityType=domain).
- "Domain" option disabled in the add menu once a form exists.
- A required native field (displayName) blocks Domain create client-side and
  the backend rejects the direct API create with 400.
- A required custom property renders on the Domain create form and blocks
  submit when left empty.
- A required entity-reference custom property renders as a user picker and the
  Domain is created with the bare-object extension payload (isolated in its own
  describe for a fresh page, same MUIUserTeamSelect reason as the DataProduct
  entity-ref test).

15/15 IntakeForm tests pass on the local stack.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@harshach harshach requested a review from a team as a code owner June 18, 2026 02:52
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

❌ PR checklist incomplete

This PR cannot be merged until the following are addressed on its linked issue:

  • No GitHub issue is linked. Link an issue in the Development section of the PR (or add Fixes #12345 to the description). For a same-org cross-repo issue, add Fixes open-metadata/<repo>#123 to the description.

The fields live on the linked issue in the Shipping project (open the issue → right sidebar → Projects). After you set them, re-run this check (or push a commit) — issue/project changes do not re-trigger it automatically.

Maintainers can bypass this check by adding the skip-pr-checks label.

@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Jun 18, 2026
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

❌ UI Checkstyle Failed

❌ ESLint + Prettier + Organise Imports (src)

One or more source files have linting or formatting issues.

Affected files
  • openmetadata-ui/src/main/resources/ui/src/locale/languages/ar-sa.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/de-de.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/es-es.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/fr-fr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/gl-es.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/he-he.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/ja-jp.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/ko-kr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/mr-in.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/nl-nl.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/pr-pr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/pt-br.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/pt-pt.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/ru-ru.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/th-th.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/tr-tr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/zh-cn.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/zh-tw.json

❌ I18n Sync

Translation locale files are out of sync with en-us.json.

Affected files
  • openmetadata-ui/src/main/resources/ui/src/locale/languages/ar-sa.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/de-de.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/es-es.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/fr-fr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/gl-es.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/he-he.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/ja-jp.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/ko-kr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/mr-in.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/nl-nl.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/pr-pr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/pt-br.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/pt-pt.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/ru-ru.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/th-th.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/tr-tr.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/zh-cn.json
    • openmetadata-ui/src/main/resources/ui/src/locale/languages/zh-tw.json

Fix locally (fast — only checks files changed in this branch):

make ui-checkstyle-changed

@github-actions

Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Comment on lines +1813 to +1822
public Response validateODPSYaml(String yamlContent) {
try {
ODPSDataProduct odps = YAML_MAPPER.readValue(yamlContent, ODPSDataProduct.class);
ODPSConverter.validateRequiredODPSFields(odps);
JsonNode summary = summarizeODPS(odps);
return Response.ok(summary).build();
} catch (JsonProcessingException e) {
throw new IllegalArgumentException("Invalid ODPS YAML content: " + e.getMessage(), e);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Validate endpoint has no authorization context

validateODPSYaml is the only non-GET endpoint in this resource that omits @Context SecurityContext securityContext. All other mutating endpoints — including the adjacent import and create-or-update YAML endpoints — receive a SecurityContext so the caller identity can be audited. Without it, this endpoint relies entirely on upstream filter-level auth with no ability to enforce role-based checks or log the caller. If framework auth is bypassed (e.g. dev/test deployment without filter), the endpoint accepts and parses arbitrary YAML from any caller with no trace.

1. ODPS merge/replace wiped owners/domains/experts/reviewers/certification on an
   existing product. findExistingByName loaded the product with only
   "id,name,version", so the lazy relationship fields came back null;
   smartMerge/fullReplace then copied those nulls and the updater recorded their
   removal — and since domains is required, a PUT against an existing product
   either 400'd (missing domain) or stripped governance fields. Load the existing
   product with EXPORT_FIELDS (the same set used for export) so those fields are
   hydrated before merging.

2. ODPSImportModal's name guard matched the first `name:` key via regex, which in
   an ODPS document can hit an SLA dimension or tag rather than the product name
   (product.details.<lang>.name). Parse the YAML with js-yaml and read that path
   (en, else the first declared language) — robust to key order and nesting.

Extends the merge E2E test to assert domains + owners survive the merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit ac251943e3b79692c41d3e9b3daad67f90854de9)
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 64%
64.19% (60949/94944) 44.64% (32678/73191) 47.44% (9765/20580)

@github-actions

Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

The previous import-modal test only covered a round-trip YAML whose name
matches the product, so it passed with either the old first-`name:` regex or
the new js-yaml parser. Add a test that pastes a YAML with a decoy top-level
`name:` matching the current product but a different real product name at
product.details.en.name — the hardened guard must read that path and block the
import (no PUT fires, modal stays open). Fails with the old regex, passes now.

9/9 ODPS tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +385 to +392
await page.getByTestId('odps-import-submit').click();

// The guard short-circuits before any API call: no PUT, modal stays open.
await expect(async () => {
expect(putFired).toBe(false);
}).toPass({ timeout: 3000, intervals: [300] });
page.off('response', listener);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();

@gitar-bot gitar-bot Bot Jun 18, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: Name-guard test assertion passes instantly, weakening regression value

The toPass block at lines 388-390 asserts expect(putFired).toBe(false). Because putFired starts as false, this assertion succeeds on the very first poll (essentially immediately after submit is clicked), so the timeout: 3000 / intervals: [300] settings never come into play. toPass retries until the assertion passes — it does not keep re-checking to confirm the condition stays true for the whole window. As a result, if the name guard regressed and a PUT to /dataProducts/odps/yaml did fire (a network round-trip takes some milliseconds), the test would likely have already passed and the regression would go undetected. The test therefore provides much weaker protection than the comment implies.

To actually verify no PUT fires, wait a fixed interval before asserting (so the network call would have had time to happen), e.g. await page.waitForTimeout(2000) then expect(putFired).toBe(false). Pairing it with the existing modal-still-visible / error-message assertion makes the regression signal robust.

Wait a fixed window so a regressed PUT would have time to fire before asserting.:

page.on('response', listener);
await page.getByTestId('odps-import-submit').click();

// The guard short-circuits before any API call: give a real PUT time to
// fire, then confirm none did and the modal stayed open.
await page.waitForTimeout(2000);
expect(putFired).toBe(false);
page.off('response', listener);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();

Was this helpful? React with 👍 / 👎

@github-actions

Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@sonarqubecloud

Copy link
Copy Markdown

@harshach harshach merged commit c8c9194 into 1.13 Jun 18, 2026
35 of 70 checks passed
@harshach harshach deleted the odps-data-products-intake-1.13 branch June 18, 2026 13:03
@gitar-bot

gitar-bot Bot commented Jun 18, 2026

Copy link
Copy Markdown
Code Review 👍 Approved with suggestions 2 resolved / 3 findings

Implements ODPS support and Custom Intake Forms by adapting main-branch features to the 1.13 architecture, including critical bug fixes for data-product persistence. Address the validation logic in IntakeFormValidator to prevent unintended 400 errors during routine PATCH operations on legacy entities.

💡 Quality: Name-guard test assertion passes instantly, weakening regression value

📄 openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/DataProductODPS.spec.ts:385-392

The toPass block at lines 388-390 asserts expect(putFired).toBe(false). Because putFired starts as false, this assertion succeeds on the very first poll (essentially immediately after submit is clicked), so the timeout: 3000 / intervals: [300] settings never come into play. toPass retries until the assertion passes — it does not keep re-checking to confirm the condition stays true for the whole window. As a result, if the name guard regressed and a PUT to /dataProducts/odps/yaml did fire (a network round-trip takes some milliseconds), the test would likely have already passed and the regression would go undetected. The test therefore provides much weaker protection than the comment implies.

To actually verify no PUT fires, wait a fixed interval before asserting (so the network call would have had time to happen), e.g. await page.waitForTimeout(2000) then expect(putFired).toBe(false). Pairing it with the existing modal-still-visible / error-message assertion makes the regression signal robust.

Wait a fixed window so a regressed PUT would have time to fire before asserting.
page.on('response', listener);
await page.getByTestId('odps-import-submit').click();

// The guard short-circuits before any API call: give a real PUT time to
// fire, then confirm none did and the modal stayed open.
await page.waitForTimeout(2000);
expect(putFired).toBe(false);
page.off('response', listener);
await expect(page.getByTestId('odps-import-modal')).toBeVisible();
✅ 2 resolved
Bug: ODPS merge/replace wipes domains/owners on existing products

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/domains/DataProductResource.java:1845-1859 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/ODPSConverter.java:283-297
In mergeOrCreateDataProductFromODPS, the existing product is loaded via findExistingByName(name) which calls repository.getByName(null, name, repository.getFields("id,name,version")). In OpenMetadata, owners, domains, experts, reviewers, tags, and certification are lazy relationship fields that are only hydrated when named in the fields parameter — so on this sparse load they all come back null.

ODPSConverter.smartMerge and fullReplace then copy these null values into the merged entity (merged.setOwners(existing.getOwners()), merged.setDomains(existing.getDomains()), setExperts, setReviewers, setCertification, etc.). The merged product is then passed to createOrUpdate, whose updater diffs against the full original from the DB and therefore records the removal of owners, experts, reviewers, and certification — silently wiping them.

Worse, domains is required for a DataProduct: even when the caller passes ?domain=, buildDataProductFromODPS sets imported.domains, but smartMerge/fullReplace overwrite it with existing.getDomains() (null). So every PUT /v1/dataProducts/odps against an already-existing product either fails with a 400 (missing required domain) or strips the domain/owners — the exact governance fields the comments claim to preserve.

Fix: load the existing product with the full relationship field set before merging.

Edge Case: ODPS YAML name guard relies on fragile regex / line order

📄 openmetadata-ui/src/main/resources/ui/src/components/DataProducts/ODPSImportModal/ODPSImportModal.component.tsx:116-130
ODPSImportModal.handleImport guards against overwriting a sibling product by extracting the name from the pasted YAML with yamlContent.match(/^\s*name\s*:\s*["']?([^"' #]+)/m). This matches the first top-level name: key, but in an ODPS document the meaningful product name lives under product.details.<lang>.name, and there can be multiple name: keys (e.g. SLA dimensions, tags). The regex grabs whichever appears first textually, which may not be the product name, producing either false mismatch errors or a bypassed guard. Since the backend matches the target purely by the sanitized ODPS product name, this client-side check can give a false sense of safety. Consider parsing the YAML (the app already validates via the backend) or reading the name from the validated summary rather than a regex.

🤖 Prompt for agents
Code Review: Implements ODPS support and Custom Intake Forms by adapting main-branch features to the 1.13 architecture, including critical bug fixes for data-product persistence. Address the validation logic in `IntakeFormValidator` to prevent unintended 400 errors during routine PATCH operations on legacy entities.

1. 💡 Quality: Name-guard test assertion passes instantly, weakening regression value
   Files: openmetadata-ui/src/main/resources/ui/playwright/e2e/Pages/DataProductODPS.spec.ts:385-392

   The `toPass` block at lines 388-390 asserts `expect(putFired).toBe(false)`. Because `putFired` starts as `false`, this assertion succeeds on the very first poll (essentially immediately after `submit` is clicked), so the `timeout: 3000` / `intervals: [300]` settings never come into play. `toPass` retries until the assertion *passes* — it does not keep re-checking to confirm the condition stays true for the whole window. As a result, if the name guard regressed and a PUT to `/dataProducts/odps/yaml` did fire (a network round-trip takes some milliseconds), the test would likely have already passed and the regression would go undetected. The test therefore provides much weaker protection than the comment implies.
   
   To actually verify no PUT fires, wait a fixed interval before asserting (so the network call would have had time to happen), e.g. `await page.waitForTimeout(2000)` then `expect(putFired).toBe(false)`. Pairing it with the existing modal-still-visible / error-message assertion makes the regression signal robust.

   Fix (Wait a fixed window so a regressed PUT would have time to fire before asserting.):
   page.on('response', listener);
   await page.getByTestId('odps-import-submit').click();
   
   // The guard short-circuits before any API call: give a real PUT time to
   // fire, then confirm none did and the modal stayed open.
   await page.waitForTimeout(2000);
   expect(putFired).toBe(false);
   page.off('response', listener);
   await expect(page.getByTestId('odps-import-modal')).toBeVisible();

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant