Fix AWS OpenSearch Serverless AOSS support and search cluster stability #27618
aniruddhaadak80 wants to merge 12 commits into open-metadata:main
…e copilot reviews
… deserialization errors
… avoid Awaitility timeouts
…s recursive=true for hard delete
…_resolution_statuses
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help!
int totalNodes = clusterStats != null && clusterStats.nodes() != null && clusterStats.nodes().count() != null
    ? clusterStats.nodes().count().total()
    : 1;
int totalShards = clusterStats != null && clusterStats.indices() != null && clusterStats.indices().shards() != null && clusterStats.indices().shards().total() != null
    ? clusterStats.indices().shards().total().intValue()
    : 0;

int maxShardsPerNode = getMaxShardsPerNode(client);
💡 Edge Case: SearchIndexClusterValidator also calls clusterStats but not clusterSettings
In SearchIndexClusterValidator.java, clusterStats is now null-guarded (lines 79-84), but getMaxShardsPerNode(client) is called on line 86. If that method internally calls clusterSettings() or another unsupported AOSS API, it will also fail. Worth verifying that the full code path in this validator is AOSS-safe.
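One way to make that path AOSS-safe, sketched under assumptions: skip the settings lookup entirely when the backend is AOSS and fall back to a default if the lookup fails on a regular cluster. `MaxShardsLookup`, `resolve`, and the `IntSupplier` wrapper are hypothetical names, not code from this PR, and the fallback of 1000 is assumed from the documented default of `cluster.max_shards_per_node`.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper, not part of the PR. */
final class MaxShardsLookup {
  // Assumed fallback: the documented default of cluster.max_shards_per_node.
  static final int DEFAULT_MAX_SHARDS_PER_NODE = 1000;

  private MaxShardsLookup() {}

  /**
   * Returns the default on AOSS without invoking the settings lookup, and
   * also falls back to the default if the lookup throws on a regular cluster.
   */
  static int resolve(boolean isAoss, IntSupplier settingsLookup) {
    if (isAoss) {
      return DEFAULT_MAX_SHARDS_PER_NODE;
    }
    try {
      return settingsLookup.getAsInt();
    } catch (RuntimeException e) {
      return DEFAULT_MAX_SHARDS_PER_NODE;
    }
  }
}
```

With this shape, the validator would pass its own lookup as the supplier, so the unsupported call is never issued on AOSS.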
Code Review 👍 Approved with suggestions (0 resolved / 1 findings)
Eliminates spurious error logs in AWS OpenSearch Serverless by handling unsupported /_cluster/stats calls. Please also address the missing clusterSettings validation in SearchIndexClusterValidator.
💡 Edge Case: SearchIndexClusterValidator also calls clusterStats but not clusterSettings
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Reduces false error logs and incorrect UNHEALTHY reporting when OpenMetadata is configured to use AWS OpenSearch Serverless (AOSS) by skipping unsupported cluster APIs and adding null-safe metric handling; also adjusts hard-delete behavior for TestCase children and adds an integration test to validate it.
Changes:
- Detect AOSS in OpenSearchClient and propagate an isAoss flag to OpenSearchGenericManager.
- Skip unsupported OpenSearch endpoints on AOSS (/_cluster/stats, /_nodes/stats, and /_cluster/health) and safely handle null cluster stats in metric/capacity calculators.
- Fix TestCase hard-delete child cleanup behavior and add an IT to verify results/resolution statuses are removed.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/search/opensearch/OpenSearchGenericManager.java | Adds isAoss handling to skip unsupported APIs and uses GET / for AOSS health checks. |
| openmetadata-service/src/main/java/org/openmetadata/service/search/opensearch/OpenSearchClient.java | Implements AOSS detection and passes the flag into OpenSearchGenericManager. |
| openmetadata-service/src/main/java/org/openmetadata/service/search/SearchClusterMetrics.java | Adds null-safe defaults when clusterStats() can be null (e.g., AOSS). |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseRepository.java | Fixes hard-delete child deletion loop and removes redundant async deletion. |
| openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/SearchIndexClusterValidator.java | Adds null-safe defaults for capacity computation when clusterStats() is null. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestCaseResourceIT.java | Adds an integration test ensuring hard delete removes results and resolution statuses. |
boolean isAoss = false;
if (config != null && config.getHost() != null && config.getHost().endsWith(".aoss.amazonaws.com")) {
  isAoss = true;
} else if (awsConfig != null && "aoss".equals(awsConfig.getServiceName())) {
  isAoss = true;
}
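The detection above could be pulled into a small pure function so it can be unit-tested without building a client. This is only a sketch: `AossDetector` and its parameters are hypothetical names, and it mirrors the two checks in the snippet (host suffix, then service name) rather than the PR's actual structure.

```java
/** Hypothetical extraction of the detection logic, not part of the PR. */
final class AossDetector {
  private static final String AOSS_HOST_SUFFIX = ".aoss.amazonaws.com";
  private static final String AOSS_SERVICE_NAME = "aoss";

  private AossDetector() {}

  /**
   * Returns true when the host ends with the AOSS domain suffix or the
   * configured AWS service name is "aoss". Both arguments may be null.
   */
  static boolean isAoss(String host, String awsServiceName) {
    if (host != null && host.endsWith(AOSS_HOST_SUFFIX)) {
      return true;
    }
    return AOSS_SERVICE_NAME.equals(awsServiceName);
  }
}
```

The comparison is case-sensitive, as in the snippet; a host that carries a port or trailing path would not match the suffix check and would rely on the service name instead.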
if (isAoss) {
  LOG.debug("Skipping cluster stats fetch — AWS OpenSearch Serverless does not support /_cluster/stats");
  return null;
}
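Returning null here pushes a null check onto every caller. As an alternative sketch, the skip could be expressed with `Optional`, which makes the "no stats on AOSS" case explicit in the type. `AossGuard` and `fetchUnlessAoss` are hypothetical names, and this is not how the PR implements it.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical wrapper for calls that AOSS does not support. */
final class AossGuard {
  private AossGuard() {}

  /**
   * Skips the call on AOSS and returns an empty Optional; otherwise runs
   * the call and wraps its (possibly null) result.
   */
  static <T> Optional<T> fetchUnlessAoss(boolean isAoss, Supplier<T> call) {
    if (isAoss) {
      return Optional.empty();
    }
    return Optional.ofNullable(call.get());
  }
}
```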
int totalNodes = clusterStats != null && clusterStats.nodes() != null && clusterStats.nodes().count() != null
    ? clusterStats.nodes().count().total()
    : 1;
int totalShards = clusterStats != null && clusterStats.indices() != null && clusterStats.indices().shards() != null && clusterStats.indices().shards().total() != null
    ? clusterStats.indices().shards().total().intValue()
    : 0;
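The chained null checks above can be hard to scan. A possible alternative, sketched with `Optional`, keeps the same defaults (1 node, 0 shards). The record types below are hypothetical stand-ins for the client's response classes, whose real accessor types may differ.

```java
import java.util.Optional;

/** Hypothetical sketch; the records stand in for the client's stats types. */
final class ClusterCapacity {
  record NodeCount(Integer total) {}
  record Nodes(NodeCount count) {}
  record Shards(Double total) {}
  record Indices(Shards shards) {}
  record Stats(Nodes nodes, Indices indices) {}

  private ClusterCapacity() {}

  /** Defaults to a single node when any link in the chain is null. */
  static int totalNodes(Stats stats) {
    return Optional.ofNullable(stats)
        .map(Stats::nodes)
        .map(Nodes::count)
        .map(NodeCount::total)
        .orElse(1);
  }

  /** Defaults to zero shards when any link in the chain is null. */
  static int totalShards(Stats stats) {
    return Optional.ofNullable(stats)
        .map(Stats::indices)
        .map(Indices::shards)
        .map(Shards::total)
        .map(Double::intValue)
        .orElse(0);
  }
}
```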
  protected void deleteChildren(
      List<CollectionDAO.EntityRelationshipRecord> children, boolean hardDelete, String updatedBy) {
    if (hardDelete) {
+     TestCaseResolutionStatusRepository testCaseResolutionStatusRepository =
+         (TestCaseResolutionStatusRepository)
+             Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESOLUTION_STATUS);
      for (CollectionDAO.EntityRelationshipRecord entityRelationshipRecord : children) {
        LOG.info(
-           "Recursively {} deleting {} {}",
-           hardDelete ? "hard" : "soft",
+           "Recursively hard deleting {} {}",
            entityRelationshipRecord.getType(),
            entityRelationshipRecord.getId());
-       TestCaseResolutionStatusRepository testCaseResolutionStatusRepository =
-           (TestCaseResolutionStatusRepository)
-               Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESOLUTION_STATUS);
-       for (CollectionDAO.EntityRelationshipRecord child : children) {
-         testCaseResolutionStatusRepository.deleteById(child.getId(), hardDelete);
-       }
+       testCaseResolutionStatusRepository.deleteById(entityRelationshipRecord.getId(), hardDelete);
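To illustrate what the loop change buys, here is a minimal sketch of the single-pass shape: the repository is resolved once and each child is deleted exactly once, instead of re-iterating all children for every child. `ChildCleanup` and `StatusRepository` are hypothetical stand-ins, and the soft-delete path is deliberately left out because the excerpt does not show it.

```java
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch of the single-pass hard delete, not the PR's code. */
final class ChildCleanup {
  /** Stand-in for the resolution status repository. */
  interface StatusRepository {
    void deleteById(UUID id, boolean hardDelete);
  }

  private ChildCleanup() {}

  /**
   * Deletes each child exactly once when hardDelete is true.
   * Assumption: soft deletes are handled elsewhere, so this is a no-op for them.
   */
  static void deleteChildren(List<UUID> childIds, boolean hardDelete, StatusRepository repo) {
    if (!hardDelete) {
      return;
    }
    for (UUID id : childIds) {
      repo.deleteById(id, true);
    }
  }
}
```

With n children this issues n `deleteById` calls, where the nested loop in the removed lines would issue n × n.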
Description
This PR resolves #27599 by implementing AWS OpenSearch Serverless (AOSS) detection and search backend stability fixes.
Key Changes
- Updated OpenSearchClient.java to detect AOSS environments via hostname patterns or the SEARCH_AWS_SERVICE_NAME environment variable.
- Skipped cluster stats and nodes stats in OpenSearchGenericManager.java to prevent spurious error logs in AOSS.
- Replaced cluster health calls with client.info() (the GET root endpoint) for AOSS deployments to ensure correct health status reporting.
- Added null-safe handling in SearchClusterMetrics.java and SearchIndexClusterValidator.java to handle missing cluster statistics gracefully.
- Updated TestCaseResourceIT.java to handle 404 Not Found responses when listing resolution statuses for hard-deleted test cases, improving CI reliability.
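The health-check change can be pictured roughly as follows. This is a sketch under assumptions, not the PR's implementation: `SearchHealth`, `Probe`, and `Status` are hypothetical, and the mapping of a "red" cluster status to UNHEALTHY is assumed for illustration.

```java
/** Hypothetical sketch of an AOSS-aware health check. */
final class SearchHealth {
  /** Stand-in for the two client calls involved. */
  interface Probe {
    String clusterHealthStatus() throws Exception; // e.g. GET /_cluster/health

    void info() throws Exception; // e.g. GET /
  }

  enum Status {
    HEALTHY,
    UNHEALTHY
  }

  private SearchHealth() {}

  /**
   * On AOSS, a successful root call counts as healthy. Elsewhere the
   * cluster health status is used; "red" is assumed to mean unhealthy.
   */
  static Status check(boolean isAoss, Probe probe) {
    try {
      if (isAoss) {
        probe.info();
        return Status.HEALTHY;
      }
      String status = probe.clusterHealthStatus();
      return "red".equalsIgnoreCase(status) ? Status.UNHEALTHY : Status.HEALTHY;
    } catch (Exception e) {
      // Any failure to reach the backend is reported as unhealthy.
      return Status.UNHEALTHY;
    }
  }
}
```

The point of the branch is that the unsupported health endpoint is never called on AOSS, so its failure cannot be misreported as an unhealthy cluster.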