Fix RDS IAM Cross Account Auth and Clarify Dev Container Docs#27632
Fix RDS IAM Cross Account Auth and Clarify Dev Container Docs#27632aniruddhaadak80 wants to merge 5 commits intoopen-metadata:mainfrom
Conversation
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
There was a problem hiding this comment.
Pull request overview
This PR adds cross-account support for AWS RDS IAM auth token generation by optionally assuming an STS role, and updates developer documentation/devcontainer configs to clarify Dev Container initialization.
Changes:
- Add optional
assumeRoleArnJDBC query param support inAwsRdsDatabaseAuthenticationProviderusing STS assume-role credentials. - Document Dev Container workflows in
DEVELOPER.md, clarifying thatpost-create.shis the one-time initialization script shared by both devcontainer modes. - Add clarifying inline notes to Dev Container
postCreateCommandconfiguration.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/util/jdbi/AwsRdsDatabaseAuthenticationProvider.java | Add optional STS assume-role credentials provider for cross-account RDS IAM token generation. |
| DEVELOPER.md | Add Dev Containers section clarifying the two configs and initialization flow. |
| .devcontainer/full-stack/devcontainer.json | Add clarification near postCreateCommand (but currently via JSON comment). |
| .devcontainer/dev/devcontainer.json | Add clarification near postCreateCommand (but currently via JSON comment). |
| AwsCredentialsProvider credentialsProvider = DefaultCredentialsProvider.create(); | ||
|
|
||
| if (assumeRoleArn != null) { | ||
| StsClient stsClient = | ||
| StsClient.builder() | ||
| .region(Region.of(awsRegion)) | ||
| .credentialsProvider(credentialsProvider) | ||
| .build(); | ||
|
|
||
| AssumeRoleRequest assumeRoleRequest = | ||
| AssumeRoleRequest.builder() | ||
| .roleArn(assumeRoleArn) | ||
| .roleSessionName("OpenMetadata-RDS-IAM-Auth") | ||
| .build(); | ||
|
|
||
| credentialsProvider = | ||
| StsAssumeRoleCredentialsProvider.builder() | ||
| .stsClient(stsClient) | ||
| .refreshRequest(assumeRoleRequest) | ||
| .build(); | ||
| } |
There was a problem hiding this comment.
assumeRoleArn triggers creating a new StsClient + StsAssumeRoleCredentialsProvider on every authenticate() call. In the IAM-auth path this runs per DB connection, so this will repeatedly call STS (latency + throttling risk) and also leaves the StsClient/provider unclosed, potentially leaking HTTP resources/threads. Consider constructing and reusing an assume-role credentials provider (e.g., cached by awsRegion+assumeRoleArn or initialized once per pool) and ensuring any SDK clients/providers are closed on shutdown.
| if (assumeRoleArn != null) { | ||
| StsClient stsClient = | ||
| StsClient.builder() | ||
| .region(Region.of(awsRegion)) | ||
| .credentialsProvider(credentialsProvider) | ||
| .build(); | ||
|
|
||
| AssumeRoleRequest assumeRoleRequest = | ||
| AssumeRoleRequest.builder() | ||
| .roleArn(assumeRoleArn) | ||
| .roleSessionName("OpenMetadata-RDS-IAM-Auth") | ||
| .build(); |
There was a problem hiding this comment.
The check if (assumeRoleArn != null) will attempt an STS assume-role even when the query param is present but empty/whitespace (e.g. assumeRoleArn=), which will fail with an AWS SDK validation error. Treat blank values as “not provided” (e.g., check isBlank() and skip) or raise a clear configuration error.
| StsClient stsClient = | ||
| StsClient.builder() | ||
| .region(Region.of(awsRegion)) | ||
| .credentialsProvider(credentialsProvider) | ||
| .build(); | ||
|
|
||
| AssumeRoleRequest assumeRoleRequest = | ||
| AssumeRoleRequest.builder() | ||
| .roleArn(assumeRoleArn) | ||
| .roleSessionName("OpenMetadata-RDS-IAM-Auth") | ||
| .build(); | ||
|
|
||
| credentialsProvider = | ||
| StsAssumeRoleCredentialsProvider.builder() | ||
| .stsClient(stsClient) | ||
| .refreshRequest(assumeRoleRequest) | ||
| .build(); | ||
| } |
There was a problem hiding this comment.
With the new STS assume-role path, failures (STS call errors, invalid role ARN, missing permissions, etc.) will throw AWS SDK runtime exceptions that currently bypass the catch (MalformedURLException e) and won’t be wrapped in DatabaseAuthenticationProviderException. Consider catching broader exceptions in authenticate() and wrapping them consistently to keep error handling aligned with other providers (e.g., AzureDatabaseAuthenticationProvider).
| "version": "22.17.0" | ||
| } | ||
| }, | ||
| // Use post-create script for one-time environment initialization (ANTLR, dependencies, venv) |
There was a problem hiding this comment.
devcontainer.json is parsed by the repo’s pre-commit check-json hook (strict JSON). Adding // comments makes this file invalid JSON and will cause the hook/CI to fail. Please remove the comment or replace it with a normal JSON field (e.g., a _comment property) if you want to keep the explanation in-file.
| // Use post-create script for one-time environment initialization (ANTLR, dependencies, venv) | |
| "_comment": "Use post-create script for one-time environment initialization (ANTLR, dependencies, venv)", |
| "version": "22.17.0" | ||
| } | ||
| }, | ||
| // Use post-create script for one-time environment initialization (ANTLR, dependencies, venv) |
There was a problem hiding this comment.
devcontainer.json is parsed by the repo’s pre-commit check-json hook (strict JSON). Adding // comments makes this file invalid JSON and will cause the hook/CI to fail. Please remove the comment or replace it with a normal JSON field (e.g., a _comment property) if you want to keep the explanation in-file.
| // Use post-create script for one-time environment initialization (ANTLR, dependencies, venv) | |
| "_comment": "Use post-create script for one-time environment initialization (ANTLR, dependencies, venv)", |
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
…-docs-27552-27517
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| if (CommonUtil.nullOrEmpty(awsRegion)) { | ||
| throw new DatabaseAuthenticationProviderException( | ||
| "Parameter `awsRegion` shall be provided in the jdbc url."); | ||
| } | ||
| if (CommonUtil.nullOrEmpty(allowPublicKeyRetrieval)) { | ||
| throw new DatabaseAuthenticationProviderException( | ||
| "Parameter `allowPublicKeyRetrieval` shall be provided in the jdbc url."); | ||
| } |
There was a problem hiding this comment.
CommonUtil.nullOrEmpty only checks isEmpty() and will treat whitespace-only values as present. That means values like awsRegion=%20 or assumeRoleArn=%20 will pass validation and then fail later (e.g., Region.of(" ") or STS AssumeRole with an invalid ARN), producing a harder-to-diagnose error. Consider validating these parameters with a blank-aware check (e.g., trim + empty, or StringUtils.isBlank) so whitespace-only inputs are rejected as missing.
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| private static boolean checkIsAoss(ElasticSearchConfiguration config) { | ||
| String hostConfig = config != null ? config.getHost() : null; | ||
| if (StringUtils.isBlank(hostConfig)) { | ||
| return false; | ||
| } | ||
| for (String host : hostConfig.split(",")) { | ||
| String trimmedHost = host.trim().toLowerCase(); | ||
| if (trimmedHost.isEmpty()) { | ||
| continue; | ||
| } | ||
|
|
||
| String hostname = trimmedHost; | ||
| try { | ||
| // Add protocol if missing to make URI parsing easier | ||
| String uriString = trimmedHost.contains("://") ? trimmedHost : "https://" + trimmedHost; |
There was a problem hiding this comment.
💡 Edge Case: AOSS detection misses SEARCH_AWS_SERVICE_NAME=aoss config
The checkIsAoss method only checks if the hostname ends with .aoss.amazonaws.com. The PR description mentions also checking SEARCH_AWS_SERVICE_NAME=aoss as a detection mechanism. Users connecting through a proxy, VPN endpoint, or custom DNS CNAME would have a hostname that doesn't match the .aoss.amazonaws.com suffix, causing AOSS to go undetected and cluster-level API calls to fail.
Consider also checking the AWS service name configuration (if available in ElasticSearchConfiguration or AwsConfiguration) as a secondary signal.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| @Override | ||
| protected void postDelete(TestCase entity, boolean hardDelete) { | ||
| super.postDelete(entity, hardDelete); | ||
| if (hardDelete) { | ||
| // Delete test case results and resolution statuses | ||
| Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESULT) | ||
| .delete(entity.getFullyQualifiedName()); | ||
| Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESOLUTION_STATUS) | ||
| .delete(entity.getFullyQualifiedName()); | ||
| } | ||
| } |
There was a problem hiding this comment.
🚨 Bug: Duplicate postDelete method will cause compilation error
The class TestCaseRepository already defines postDelete(TestCase, boolean) at line 862 (which calls updateTestSuite(testCase)). The new postDelete added at line 1534 has the identical signature. Java does not allow two methods with the same name and parameter types in the same class — this will fail to compile.
Additionally, even if one were removed, the existing postDelete at line 862 calls updateTestSuite() which is needed to keep test suite counts in sync. The new method's hard-delete cleanup logic must be merged into the existing method, not added as a separate override.
Suggested fix:
Merge the hard-delete cleanup into the existing postDelete at line 862:
@Override
protected void postDelete(TestCase testCase, boolean hardDelete) {
super.postDelete(testCase, hardDelete);
updateTestSuite(testCase);
if (hardDelete) {
Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESULT)
.delete(testCase.getFullyQualifiedName());
Entity.getEntityTimeSeriesRepository(Entity.TEST_CASE_RESOLUTION_STATUS)
.delete(testCase.getFullyQualifiedName());
}
}
Then remove the duplicate method at line 1534.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
| totalShards = | ||
| (clusterStats.indices() != null && clusterStats.indices().shards() != null) | ||
| ? clusterStats.indices().shards().total().intValue() | ||
| : 0; |
There was a problem hiding this comment.
⚠️ Bug: Null check on shards().total() was dropped, risking NPE
The old code guarded against shards().total() being null before calling .intValue(). The refactored code adds null checks for indices() and shards() but drops the existing null check on total(), which returns a nullable Long. If the OpenSearch API returns a response where shards is present but total is null, this will throw a NullPointerException on .intValue().
This affects both SearchIndexClusterValidator.java and SearchClusterMetrics.java with the identical pattern.
Suggested fix:
Add back the null check on total():
totalShards =
(clusterStats.indices() != null
&& clusterStats.indices().shards() != null
&& clusterStats.indices().shards().total() != null)
? clusterStats.indices().shards().total().intValue()
: 0;
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
Code Review 🚫 Blocked 2 resolved / 5 findingsRDS IAM cross-account authentication and Dev Container documentation are updated, addressing resource leaks in StsClient. The build is currently broken due to a duplicate postDelete method and a missing null check, and AOSS detection requires additional configuration coverage. 🚨 Bug: Duplicate postDelete method will cause compilation error📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseRepository.java:1534-1544 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TestCaseRepository.java:862-866 The class Additionally, even if one were removed, the existing Suggested fix
|
| Compact |
|
Was this helpful? React with 👍 / 👎 | Gitar
This PR addresses two issues: #27552 and #27517. It implements support for the optional assumeRoleArn parameter in AwsRdsDatabaseAuthenticationProvider.java to enable cross-account IAM authentication for RDS. It also enhances the DEVELOPER.md documentation with a dedicated section for Dev Containers and adds clarifying comments to .devcontainer configs to clarify that post-create.sh is the primary initialization script.
Summary by Gitar
postDeletelogic inTestCaseRepositoryto perform hard-delete for associated results and resolution statuses.SearchIndexClusterValidatorandSearchClusterMetricsto include null-checks forclusterStatsresponses, preventing potential runtime exceptions.This will update automatically on new commits.