Fixes #27418: optimize test case hard-delete relationship cleanup #27633

Open
Megh-Shah-08 wants to merge 3 commits into open-metadata:main from Megh-Shah-08:fix/cascading-testcase-status-cleanup-27418

Conversation

Megh-Shah-08 (Contributor) commented Apr 22, 2026

Describe your changes:

Fixes #27418

Summary: Optimized the cleanup of TestCaseResolutionStatus relationships during TestCase hard-deletes. Replaced an inefficient $O(N)$ sequential deletion loop with a single set-based SQL delete.

Root Cause: The repository was iterating through each relationship and deleting them one-by-one. This caused significant database round-trip overhead for test cases with large incident histories.

Changes:

  1. Refactored TestCaseResolutionStatusRepository to use batchDeleteRelationships.
  2. Integrated optimized cleanup into the TestCase hard-delete lifecycle.
  3. Applied project-wide formatting via spotless.
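
The difference between the two delete shapes described in the summary and root cause can be sketched with an in-memory stand-in for the relationship table. This is only an illustration: `RelationshipDeleteSketch`, `Row`, `seed`, and the `deleteStatements` counter are hypothetical names, not OpenMetadata code, and the counter only simulates database round trips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical in-memory stand-in for the entity_relationship table (not OpenMetadata code).
class RelationshipDeleteSketch {
  // One relationship row: fromId -> toId, where toEntity names the target entity type.
  static final class Row {
    final UUID fromId;
    final UUID toId;
    final String toEntity;

    Row(UUID fromId, UUID toId, String toEntity) {
      this.fromId = fromId;
      this.toId = toId;
      this.toEntity = toEntity;
    }
  }

  static final String STATUS = "testCaseResolutionStatus";

  final List<Row> rows = new ArrayList<>();
  int deleteStatements = 0; // simulated DELETE round trips

  // Seeds `count` resolution-status relationships for one test case.
  void seed(UUID testCaseId, int count) {
    for (int i = 0; i < count; i++) {
      rows.add(new Row(testCaseId, UUID.randomUUID(), STATUS));
    }
  }

  private static boolean matches(Row r, UUID testCaseId) {
    return r.fromId.equals(testCaseId) && r.toEntity.equals(STATUS);
  }

  // Per-row shape: issue one DELETE for each matching relationship.
  int deleteOneByOne(UUID testCaseId) {
    List<Row> targets = new ArrayList<>();
    for (Row r : rows) {
      if (matches(r, testCaseId)) {
        targets.add(r);
      }
    }
    for (Row r : targets) {
      rows.remove(r);
      deleteStatements++;
    }
    return targets.size();
  }

  // Set-based shape: one DELETE whose WHERE clause covers every matching row.
  int deleteSetBased(UUID testCaseId) {
    int before = rows.size();
    rows.removeIf(r -> matches(r, testCaseId));
    deleteStatements++;
    return before - rows.size();
  }
}
```

In this sketch both methods remove the same rows; the per-row version issues one simulated statement per incident, while the set-based version issues one regardless of history size.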

How I tested:

  1. Integration Test: Added TestCaseResourceIT#test_testCaseDeleteCleanup to verify total relationship cleanup (Passed).
  2. Data Persistence: Confirmed via SQL that historical time-series records are preserved for audit purposes.
  3. Manual Verification: Performed a manual hard-delete of a test case with 3+ incidents to ensure no cascading errors.

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Refactoring:
    • Standardized relationship type to Relationship.PARENT_OF for TestCase to TestCaseResolutionStatus associations.
  • Database optimization:
    • Improved CollectionDAO.deleteOrphanedRelationships query performance by switching to NOT EXISTS instead of NOT IN.
  • Integration testing:
    • Added recursive=true parameter to TestCase deletion in TestCaseResourceIT to ensure proper cascading cleanup.

This will update automatically on new commits.

Copilot AI review requested due to automatic review settings April 22, 2026 15:14
github-actions (Contributor) commented

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Copilot AI (Contributor) left a comment

Pull request overview

Optimizes cleanup of testCaseResolutionStatus relationships when deleting TestCase entities (and during retention cleanup) to avoid per-relationship deletion overhead and prevent orphaned relationship rows.

Changes:

  • Added a batch relationship cleanup path for all resolution-status records associated with a TestCase FQN.
  • Hooked the optimized cleanup into TestCase hard-delete lifecycle and the Data Retention app.
  • Updated time-series hard-delete behavior to also remove relationships; added an integration test covering the cleanup behavior.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseResolutionStatusRepository.java | Adds bulk relationship cleanup by TestCase and changes relationship type used for linking TestCase ↔ ResolutionStatus. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseRepository.java | Calls the new resolution-status relationship cleanup during TestCase entity-specific hard-delete cleanup. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityTimeSeriesRepository.java | Ensures hard-deleting a time-series record also deletes its relationships. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java | Adds DAO method to delete orphaned relationships based on missing “from” entity rows. |
| openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/dataRetention/DataRetention.java | Adds retention-job step to remove orphaned TestCase → ResolutionStatus relationship rows. |
| openmetadata-service/src/main/java/org/openmetadata/service/Entity.java | Special-cases time-series entities in Entity.deleteEntity to avoid standard delete flow. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestCaseResourceIT.java | Adds integration test asserting relationships are removed while time-series history remains. |

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/Entity.java Outdated
Comment on lines +257 to +266
@Transaction
private void cleanOrphanedTestCaseResolutionStatusRelationships() {
  LOG.info("Initiating cleanup for orphaned testCaseResolutionStatus relationships.");
  executeWithStatsTracking(
      "orphaned_test_case_resolution_status_relationships",
      () ->
          collectionDAO
              .relationshipDAO()
              .deleteOrphanedRelationships(
                  Entity.TEST_CASE, Entity.TEST_CASE_RESOLUTION_STATUS, "test_case"));
Copilot AI commented Apr 22, 2026

cleanOrphanedTestCaseResolutionStatusRelationships is wired into executeWithStatsTracking, which is designed for batched deletes (BATCH_SIZE per call). However, relationshipDAO().deleteOrphanedRelationships(...) performs an unbounded delete in a single statement, which can lead to long transactions and table locks on large entity_relationship tables. Consider implementing a batched variant (delete with LIMIT / ctid selection) or integrating this cleanup into the existing EntityRelationshipCleanupUtil batching approach so the retention job remains predictable.

Megh-Shah-08 (Contributor, Author) replied

Good point. This currently uses a single set-based delete, which is efficient and consistent with existing cleanup patterns. We can extend it to a batched approach if needed for larger datasets in a follow-up.
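
The batched variant discussed in this thread could look roughly like the following in-memory sketch: each call deletes at most a fixed number of orphaned rows, and the caller repeats until a batch deletes nothing. Everything here is an assumption made for illustration (`BatchedOrphanCleanupSketch`, `deleteOrphanedBatch`, `cleanAll`, and the string ids); it does not reflect the actual `CollectionDAO` or `executeWithStatsTracking` signatures.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of batched orphan cleanup (not OpenMetadata code).
class BatchedOrphanCleanupSketch {
  final Set<String> liveTestCaseIds = new HashSet<>();
  // Each entry is the fromId of one testCase -> testCaseResolutionStatus relationship row.
  final List<String> relationshipFromIds = new ArrayList<>();
  int batchCalls = 0; // number of bounded delete calls issued

  // Deletes at most batchSize rows whose fromId has no live test case; returns rows deleted.
  int deleteOrphanedBatch(int batchSize) {
    batchCalls++;
    int deleted = 0;
    Iterator<String> it = relationshipFromIds.iterator();
    while (it.hasNext() && deleted < batchSize) {
      if (!liveTestCaseIds.contains(it.next())) {
        it.remove();
        deleted++;
      }
    }
    return deleted;
  }

  // Repeats bounded deletes until a batch comes back empty; returns total rows deleted.
  int cleanAll(int batchSize) {
    int total = 0;
    int deleted;
    do {
      deleted = deleteOrphanedBatch(batchSize);
      total += deleted;
    } while (deleted > 0);
    return total;
  }
}
```

The trade-off in this sketch is more calls in exchange for a bounded amount of work per call, which is the predictability the review comment asks for.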

gitar-bot (Bot) commented Apr 22, 2026

Code Review 🚫 Blocked 1 resolved / 2 findings

Optimizes relationship cleanup for test case hard-deletes, but the changes to PARENT_OF and RELATED_TO mapping break existing data consistency without a required migration. The unused variable in the early return has been removed.

🚨 Bug: PARENT_OF→RELATED_TO change breaks existing data without migration

  • 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseResolutionStatusRepository.java:260
  • 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseResolutionStatusRepository.java:268
  • 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseResolutionStatusRepository.java:239-247
  • 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseResolutionStatusRepository.java:256
  • 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseResolutionStatusRepository.java:264

The relationship type between TestCase and TestCaseResolutionStatus is changed from PARENT_OF (ordinal 9) to RELATED_TO (ordinal 15) in both storeRelationship and setInheritedFields. However, there is no database migration to update the relation column in entity_relationship for existing rows.

This means:

  1. setInheritedFields() queries with Relationship.RELATED_TO but existing DB rows have PARENT_OF. Since mustHaveRelationship=true, this will throw an exception when reading any pre-existing TestCaseResolutionStatus record.
  2. New records will be stored with RELATED_TO, creating an inconsistent state where old and new records use different relationship types.

A SQL migration script is needed to update existing rows:

UPDATE entity_relationship
SET relation = 15
WHERE fromEntity = 'testCase'
  AND toEntity = 'testCaseResolutionStatus'
  AND relation = 9;
Suggested fix
Add a database migration script that updates existing PARENT_OF (9) relationships to RELATED_TO (15) for testCase→testCaseResolutionStatus pairs in the entity_relationship table.
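
The failure mode and the effect of the suggested migration can be sketched in memory. This is a hedged illustration only: `RelationMigrationSketch`, `findParent`, and `migrate` are hypothetical names, the ordinals 9 and 15 are the ones quoted in this review, and the lookup merely imitates a query that requires a matching relationship to exist.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the relation-type mismatch (not OpenMetadata code).
class RelationMigrationSketch {
  static final int PARENT_OF = 9; // ordinal quoted in the review
  static final int RELATED_TO = 15; // ordinal quoted in the review

  static final class Row {
    final String fromEntity;
    final String toEntity;
    final String toId;
    int relation;

    Row(String fromEntity, String toEntity, String toId, int relation) {
      this.fromEntity = fromEntity;
      this.toEntity = toEntity;
      this.toId = toId;
      this.relation = relation;
    }
  }

  final List<Row> rows = new ArrayList<>();

  // Imitates a lookup that must find a relationship: throws when no row has the given relation.
  Row findParent(String statusId, int relation) {
    for (Row r : rows) {
      if (r.toId.equals(statusId) && r.relation == relation) {
        return r;
      }
    }
    throw new IllegalStateException(
        "No relationship with relation=" + relation + " for " + statusId);
  }

  // In-memory equivalent of the suggested UPDATE; returns the number of rows rewritten.
  int migrate() {
    int updated = 0;
    for (Row r : rows) {
      if (r.fromEntity.equals("testCase")
          && r.toEntity.equals("testCaseResolutionStatus")
          && r.relation == PARENT_OF) {
        r.relation = RELATED_TO;
        updated++;
      }
    }
    return updated;
  }
}
```

In this model, a row written with `PARENT_OF` cannot be found by a reader that queries `RELATED_TO` until `migrate()` has rewritten it, which mirrors the inconsistency described above.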
✅ 1 resolved
Quality: Unused variable in Entity.deleteEntity early return

📄 openmetadata-service/src/main/java/org/openmetadata/service/Entity.java:774-782
The early return block for time-series entities fetches the repository into a local variable repository but never uses it. The comment says "The relationships will be cleaned up by the caller's relationshipDAO().deleteAll()" which is misleading — the caller is EntityRepository.deleteChildren which calls Entity.deleteEntity, and the early return just silently skips deletion. The dead code and misleading comment reduce clarity.

Megh-Shah-08 (Contributor, Author) commented

Hi @ayush-shah,

I’ve addressed the majority of the Copilot and Gitar review comments, including performance improvements and backward compatibility concerns.
Could you please review the latest changes and add the "safe to test" label so CI checks can proceed?

Thanks!

github-actions (Bot, Contributor) commented Apr 22, 2026

🟡 Playwright Results — all passed (11 flaky)

✅ 3700 passed · ❌ 0 failed · 🟡 11 flaky · ⏭️ 89 skipped

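| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| 🟡 Shard 1 | 480 | 0 | 1 | 4 |
| 🟡 Shard 2 | 655 | 0 | 1 | 7 |
| 🟡 Shard 3 | 665 | 0 | 1 | 1 |
| 🟡 Shard 4 | 647 | 0 | 1 | 27 |
| 🟡 Shard 5 | 610 | 0 | 1 | 42 |
| 🟡 Shard 6 | 643 | 0 | 6 | 8 |
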
🟡 11 flaky test(s) (passed on retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Flow/SchemaTable.spec.ts › schema table test (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/Glossary.spec.ts › Add and Remove Assets (shard 5, 1 retry)
  • Pages/HyperlinkCustomProperty.spec.ts › should show No Data placeholder when hyperlink has no value (shard 6, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Dashboard (shard 6, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Data Model (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Tier Add, Update and Remove (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings April 23, 2026 09:14
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comment on lines +897 to +900
TestCaseResolutionStatusRepository testCaseResolutionStatusRepository =
    (TestCaseResolutionStatusRepository)
        Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESOLUTION_STATUS);
testCaseResolutionStatusRepository.deleteAllRelationshipsByTestCase(entityInterface.getId());
Copilot AI commented Apr 23, 2026

deleteAllRelationshipsByTestCase(...) is likely redundant in the hard-delete path: EntityRepository.cleanup() already calls relationshipDAO().deleteAll(testCaseId, Entity.TEST_CASE), which deletes all fromEntity=testCase rows (including testCase -> testCaseResolutionStatus). If the intent is performance, this extra delete doesn’t avoid the expensive part (the pre-cleanup deleteChildren(...) traversal still runs). Consider removing this call, and if you want to avoid the recursive child traversal/logging for large incident histories, override deleteChildren(...) in TestCaseRepository to skip time-series child types and rely on cleanup() + targeted relationship cleanup instead.

Suggested change (remove these lines):
- TestCaseResolutionStatusRepository testCaseResolutionStatusRepository =
-     (TestCaseResolutionStatusRepository)
-         Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESOLUTION_STATUS);
- testCaseResolutionStatusRepository.deleteAllRelationshipsByTestCase(entityInterface.getId());

if (entityType.equalsIgnoreCase(Entity.TEST_CASE_RESOLUTION_STATUS)
    || entityType.equalsIgnoreCase(Entity.TEST_CASE_RESULT)) {
  // TimeSeries entities are cleaned up via entitySpecificCleanup,
  // not through the standard repository delete flow.
Copilot AI commented Apr 23, 2026

Entity.deleteEntity(...) now silently no-ops for testCaseResolutionStatus / testCaseResult. Since this is a public utility used by recursive deletion, a silent return can mask unexpected calls and make debugging difficult. Consider either (a) routing to the relevant EntityTimeSeriesRepository.deleteById(...) when hardDelete=true, or (b) logging at least a debug/warn when skipping, so callers know the entity was intentionally not deleted via the standard repository flow.

Suggested change
-   // not through the standard repository delete flow.
+   // not through the standard repository delete flow.
+   LOG.warn(
+       "Skipping standard delete flow for time series entity type {} with id {} (hardDelete={}); "
+           + "cleanup is expected to happen via entitySpecificCleanup.",
+       entityType,
+       entityId,
+       hardDelete);

Comment on lines +4463 to +4465
assertTrue(
    statusCount >= 3,
    "There should be at least 3 relationships to resolution statuses before delete");
Copilot AI commented Apr 23, 2026

The pre-delete assertion is very loose: statusCount >= 3 will still pass if duplicate/extra testCase -> testCaseResolutionStatus relationships are accidentally created. Since this test creates exactly 3 statuses, assert the exact expected count so the test reliably catches regressions.

Suggested change
- assertTrue(
-     statusCount >= 3,
-     "There should be at least 3 relationships to resolution statuses before delete");
+ assertEquals(
+     3,
+     statusCount,
+     "There should be exactly 3 relationships to resolution statuses before delete");

Labels

safe to test Add this label to run secure Github workflows on PRs

Development

Successfully merging this pull request may close these issues.

Enhancement: Data Retention App to clean up orphaned testCase -> testCaseResolutionStatus relationships

3 participants