-
Notifications
You must be signed in to change notification settings - Fork 6
feat(aws): Implement optional Amazon KMS (Key Management Service) based encryption #1052
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Akhil Repala <arepala@blinklabs.io>
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds SOPS-based encryption/decryption and plaintext-to-SOPS migration across blob stores. database/plugin/blob/aws/commit_timestamp.go: GetCommitTimestamp fetches ciphertext, attempts SOPS decryption, treats non-JSON decryption failures as legacy plaintext (big-endian int64), migrates by re-encrypting via SetCommitTimestamp, and returns the timestamp; SetCommitTimestamp encrypts timestamp bytes before storing. database/sops/sops.go: switches to binary SOPS stores and adds getMasterKeyGroupsFromEnv to build master key groups from environment variables (GCP/AWS KMS). database/plugin/blob/gcs/commit_timestamp.go: similar plaintext-detection and migration logic added for GCS. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Areas needing extra attention:
Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (4 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 issues found across 2 files
Prompt for AI agents (all 3 issues)
Understand the root cause of the following 3 issues and fix them.
<file name="database/plugin/blob/aws/commit_timestamp.go">
<violation number="1" location="database/plugin/blob/aws/commit_timestamp.go:31">
`GetCommitTimestamp` cannot read the previously stored plaintext value anymore, so existing S3 buckets fail to start after this change. You need to detect legacy plaintext bytes (e.g., try decrypt, fall back to interpreting the raw bytes and re-encrypt) before returning an error.</violation>
</file>
<file name="database/sops/sops.go">
<violation number="1" location="database/sops/sops.go:102">
The AWS branch now returns an error whenever DINGO_AWS_KMS_KEY_ARNS is empty, so GCP-only configurations fail even though they already provide a master key, and the error references the wrong env var name.</violation>
<violation number="2" location="database/sops/sops.go:102">
Returning an error whenever DINGO_GCP_KMS_RESOURCE_ID is unset forces every deployment to configure the Google KMS key even if AWS KMS keys are present, so AWS-only setups cannot encrypt despite the "and/or" requirement.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
| if err != nil { | ||
| return 0, err | ||
| } | ||
| plaintext, err := dingosops.Decrypt(ciphertext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetCommitTimestamp cannot read the previously stored plaintext value anymore, so existing S3 buckets fail to start after this change. You need to detect legacy plaintext bytes (e.g., try decrypt, fall back to interpreting the raw bytes and re-encrypt) before returning an error.
Prompt for AI agents
Address the following comment on database/plugin/blob/aws/commit_timestamp.go at line 31:
<comment>`GetCommitTimestamp` cannot read the previously stored plaintext value anymore, so existing S3 buckets fail to start after this change. You need to detect legacy plaintext bytes (e.g., try decrypt, fall back to interpreting the raw bytes and re-encrypt) before returning an error.</comment>
<file context>
@@ -17,19 +17,35 @@ package aws
+ if err != nil {
+ return 0, err
+ }
+ plaintext, err := dingosops.Decrypt(ciphertext)
if err != nil {
+ b.logger.Errorf("failed to decrypt commit timestamp: %v", err)
</file context>
✅ Addressed in 579664e
database/sops/sops.go
Outdated
| keyGroups = append(keyGroups, keys) | ||
| } | ||
| } else { | ||
| return nil, errors.New( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AWS branch now returns an error whenever DINGO_AWS_KMS_KEY_ARNS is empty, so GCP-only configurations fail even though they already provide a master key, and the error references the wrong env var name.
Prompt for AI agents
Address the following comment on database/sops/sops.go at line 102:
<comment>The AWS branch now returns an error whenever DINGO_AWS_KMS_KEY_ARNS is empty, so GCP-only configurations fail even though they already provide a master key, and the error references the wrong env var name.</comment>
<file context>
@@ -91,3 +84,47 @@ func Encrypt(data []byte) ([]byte, error) {
+ keyGroups = append(keyGroups, keys)
+ }
+ } else {
+ return nil, errors.New(
+ "DINGO_GCP_KMS_RESOURCE_ID not set: SOPS requires at least one master key to encrypt",
+ )
</file context>
✅ Addressed in 579664e
database/sops/sops.go
Outdated
| keyGroups = append(keyGroups, keys) | ||
| } | ||
| } else { | ||
| return nil, errors.New( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning an error whenever DINGO_GCP_KMS_RESOURCE_ID is unset forces every deployment to configure the Google KMS key even if AWS KMS keys are present, so AWS-only setups cannot encrypt despite the "and/or" requirement.
Prompt for AI agents
Address the following comment on database/sops/sops.go at line 102:
<comment>Returning an error whenever DINGO_GCP_KMS_RESOURCE_ID is unset forces every deployment to configure the Google KMS key even if AWS KMS keys are present, so AWS-only setups cannot encrypt despite the "and/or" requirement.</comment>
<file context>
@@ -91,3 +84,47 @@ func Encrypt(data []byte) ([]byte, error) {
+ keyGroups = append(keyGroups, keys)
+ }
+ } else {
+ return nil, errors.New(
+ "DINGO_GCP_KMS_RESOURCE_ID not set: SOPS requires at least one master key to encrypt",
+ )
</file context>
✅ Addressed in 579664e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
database/plugin/blob/aws/commit_timestamp.go(1 hunks)database/sops/sops.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
database/plugin/blob/aws/commit_timestamp.go (2)
database/plugin/blob/aws/database.go (1)
BlobStoreS3(44-52)database/sops/sops.go (2)
Decrypt(33-39)Encrypt(41-86)
Signed-off-by: Akhil Repala <arepala@blinklabs.io>
Signed-off-by: Akhil Repala <arepala@blinklabs.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
database/sops/sops.go (1)
42-88: Encryption requirement breaks GCS and S3 deployments without KMS configuration.The issue is confirmed.
SetCommitTimestampin GCS (line 68 ofdatabase/plugin/blob/gcs/commit_timestamp.go) and S3 (line 53 ofdatabase/plugin/blob/aws/commit_timestamp.go) backends unconditionally calldingosops.Encrypt(). SinceEncryptreturns an error whengetMasterKeyGroupsFromEnv()finds no configured keys (lines 116-120 ofdatabase/sops/sops.go), deployments without KMS configuration cannot write commit timestamps. This breaks backward compatibility for plaintext-only deployments.The inconsistency is notable: SQLite and Badger backends store timestamps plaintext without encryption, while GCS and S3 fail entirely without KMS keys.
To support truly optional encryption, modify:
Encrypt()indatabase/sops/sops.go(lines 62-65): Detect missing KMS keys and return plaintext instead of errorDecrypt()indatabase/sops/sops.go(lines 34-40): Add plaintext fallback whendecrypt.Data()fails
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
database/plugin/blob/aws/commit_timestamp.go(1 hunks)database/plugin/blob/gcs/commit_timestamp.go(2 hunks)database/sops/sops.go(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: nilaway
- GitHub Check: Analyze (go)
🔇 Additional comments (6)
database/sops/sops.go (2)
90-123: Past issues resolved: both KMS providers now optional.The refactored
getMasterKeyGroupsFromEnvcorrectly implements the "at least one master key" requirement:
- Lines 94-102: GCP KMS is optional (only processed if
DINGO_GCP_KMS_RESOURCE_IDis set)- Lines 105-114: AWS KMS is optional (only processed if
DINGO_AWS_KMS_KEY_ARNSis set)- Lines 116-120: Error returned only when both are unset
This fixes the previous issues where GCP-only or AWS-only configurations would fail.
34-40: Binary format correctly configured for arbitrary data.The switch to
decrypt.Data(data, "binary")is correct for handling raw timestamp bytes. SOPS binary mode still produces JSON-wrapped encrypted output (containing metadata), which allows the migration logic in commit_timestamp.go files to distinguish SOPS-encrypted data from legacy plaintext usingjson.Valid().database/plugin/blob/aws/commit_timestamp.go (2)
28-48: Migration logic correctly handles legacy plaintext timestamps.The plaintext detection approach is sound:
- SOPS-encrypted data is always JSON (wraps metadata)
- Legacy plaintext is raw big-endian bytes (not JSON)
- Line 34:
!json.Valid(ciphertext)correctly distinguishes these casesThe migration flow (Lines 36-43):
- Parses plaintext timestamp
- Re-encrypts via
SetCommitTimestamp- Logs migration attempt
- Returns timestamp even if migration fails (prevents read failures)
This provides a smooth upgrade path from plaintext to encrypted storage.
51-63: SetCommitTimestamp implementation is clean and well-logged.The encryption flow is straightforward:
- Line 52: Convert timestamp to big-endian bytes
- Line 53: Encrypt via SOPS
- Lines 54-57: Handle encryption failure
- Lines 58-60: Store ciphertext
- Line 61: Log successful write
Error handling and logging are appropriate. Note that this will fail when no KMS keys are configured (per the issue in
sops.go).database/plugin/blob/gcs/commit_timestamp.go (2)
28-60: GetCommitTimestamp mirrors AWS implementation correctly.The GCS version uses identical plaintext detection and migration logic (Lines 44-54) as the AWS version:
- Same
json.Validcheck to distinguish plaintext from SOPS data- Same migration path via
SetCommitTimestamp- Same error handling (log but continue on migration failure)
This consistency is good for maintainability. The same considerations apply:
- Depends on resolution of mandatory encryption behavior in
sops.go- Will log migration errors repeatedly if no KMS keys configured
62-86: SetCommitTimestamp implementation is consistent with AWS version.The encryption and storage flow matches the AWS implementation:
- Lines 66-67: Convert timestamp to bytes
- Lines 68-72: Encrypt with error handling
- Lines 74-83: Write to GCS with proper cleanup
- Line 84: Log successful write
The GCS-specific writer handling (Lines 75-83) correctly closes the writer even on write failure (Line 76). Good defensive coding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 3 files (reviewed changes from recent commits).
Prompt for AI agents (all 1 issues)
Understand the root cause of the following 1 issues and fix them.
<file name="database/plugin/blob/gcs/commit_timestamp.go">
<violation number="1" location="database/plugin/blob/gcs/commit_timestamp.go:44">
Decrypt fallback wrongly interprets any non‑JSON ciphertext as a plaintext timestamp, so corrupted SOPS blobs get turned into random commit timestamps and overwritten instead of raising an error.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
Signed-off-by: Akhil Repala <arepala@blinklabs.io>
Closes #334
Summary by cubic
Add AWS KMS encryption via SOPS and apply it to the S3 commit timestamp blob. This secures metadata at rest and requires at least one KMS (GCP or AWS) to be configured. Closes #334.
New Features
Migration
Written for commit 20c8256. Summary will update automatically on new commits.
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.