@arepala-uml arepala-uml (Contributor) commented Nov 24, 2025

  1. Added database/sops/sops.go with encrypt/decrypt helpers using getsops/sops and AWS KMS from DINGO_AWS_KMS_KEY_ARNS.
  2. Updated Blob Store AWS GetCommitTimestamp/SetCommitTimestamp to use the new SOPS package for transparent encryption/decryption.

Closes #334


Summary by cubic

Add AWS KMS encryption via SOPS and apply it to the S3 commit timestamp blob. This secures metadata at rest and requires at least one KMS (GCP or AWS) to be configured. Closes #334.

  • New Features

    • SOPS helpers support AWS KMS (DINGO_AWS_KMS_KEY_ARNS, optional DINGO_AWS_KMS_PROFILE) alongside GCP KMS; at least one must be set.
    • BlobStore S3 encrypts/decrypts commit timestamps and auto-migrates plaintext on read; GCS now also auto-migrates plaintext on read.
  • Migration

    • Set DINGO_GCP_KMS_RESOURCE_ID and/or DINGO_AWS_KMS_KEY_ARNS.
    • If a plaintext commit timestamp exists in S3 or GCS, it will be rewritten encrypted on next read.

Written for commit 20c8256. Summary will update automatically on new commits.

Summary by CodeRabbit

  • New Features

    • Added AWS KMS support alongside GCP KMS for encryption key management, configurable via environment.
    • Encryption now uses a binary-backed workflow for stored blobs.
  • Chores

    • Commit timestamps are stored encrypted and decrypted on read, with improved error handling and logging.
    • Automatic migration for legacy plaintext timestamps found in storage — they are detected, re-encrypted, and returned.

Signed-off-by: Akhil Repala <arepala@blinklabs.io>
@coderabbitai coderabbitai bot (Contributor) commented Nov 24, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds SOPS-based encryption/decryption and plaintext-to-SOPS migration across blob stores.

  • database/plugin/blob/aws/commit_timestamp.go: GetCommitTimestamp fetches the stored blob and attempts SOPS decryption. If decryption fails and the blob is not valid JSON, it is treated as a legacy plaintext value (big-endian int64), migrated by re-encrypting via SetCommitTimestamp, and the timestamp is returned. SetCommitTimestamp encrypts the timestamp bytes before storing them.
  • database/sops/sops.go: switches to binary SOPS stores and adds getMasterKeyGroupsFromEnv to build master key groups from environment variables (GCP/AWS KMS).
  • database/plugin/blob/gcs/commit_timestamp.go: adds the same plaintext-detection and migration logic for GCS.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas needing extra attention:

  • database/plugin/blob/aws/commit_timestamp.go — decryption flow, JSON validity check, legacy big-endian int64 parsing, and migration write correctness.
  • database/plugin/blob/gcs/commit_timestamp.go — parallel plaintext detection/migration logic and consistency with AWS implementation.
  • database/sops/sops.go — binary store usage, getMasterKeyGroupsFromEnv parsing of environment variables, construction of key groups (GCP/AWS), and error messages when no keys are configured.
  • Cross-file consistency — ensure encryption format, metadata, and error handling are compatible between AWS/GCS stores and the sops helpers.

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)
  • Title check (✅ Passed): The title clearly summarizes the main objective of the PR: implementing optional Amazon KMS encryption, which directly aligns with the changes to add AWS KMS support to the SOPS encryption helpers and enable transparent encryption/decryption in the BlobStoreS3.
  • Linked Issues check (✅ Passed): The PR successfully implements optional Amazon KMS-based encryption by adding AWS KMS support to the SOPS package and integrating transparent encryption/decryption for commit timestamps in BlobStoreS3, fully addressing issue #334's requirement for at-rest encryption using user-provided KMS keys.
  • Out of Scope Changes check (✅ Passed): The changes are focused and on-scope: AWS KMS integration in SOPS, AWS KMS key handling via environment variables, commit timestamp encryption/decryption in S3, migration support for legacy plaintext timestamps, and similar updates to GCS. All modifications directly support the stated objective of implementing optional Amazon KMS-based encryption.
  • Description check (✅ Passed): Check skipped because CodeRabbit’s high-level summary is enabled.

@arepala-uml arepala-uml marked this pull request as ready for review November 24, 2025 05:31
@arepala-uml arepala-uml requested a review from a team as a code owner November 24, 2025 05:31

@cubic-dev-ai cubic-dev-ai bot left a comment

3 issues found across 2 files

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name="database/plugin/blob/aws/commit_timestamp.go">

<violation number="1" location="database/plugin/blob/aws/commit_timestamp.go:31">
`GetCommitTimestamp` cannot read the previously stored plaintext value anymore, so existing S3 buckets fail to start after this change. You need to detect legacy plaintext bytes (e.g., try decrypt, fall back to interpreting the raw bytes and re-encrypt) before returning an error.</violation>
</file>

<file name="database/sops/sops.go">

<violation number="1" location="database/sops/sops.go:102">
The AWS branch now returns an error whenever DINGO_AWS_KMS_KEY_ARNS is empty, so GCP-only configurations fail even though they already provide a master key, and the error references the wrong env var name.</violation>

<violation number="2" location="database/sops/sops.go:102">
Returning an error whenever DINGO_GCP_KMS_RESOURCE_ID is unset forces every deployment to configure the Google KMS key even if AWS KMS keys are present, so AWS-only setups cannot encrypt despite the "and/or" requirement.</violation>
</file>

if err != nil {
return 0, err
}
plaintext, err := dingosops.Decrypt(ciphertext)

@cubic-dev-ai cubic-dev-ai bot Nov 24, 2025

GetCommitTimestamp cannot read the previously stored plaintext value anymore, so existing S3 buckets fail to start after this change. You need to detect legacy plaintext bytes (e.g., try decrypt, fall back to interpreting the raw bytes and re-encrypt) before returning an error.

<file context>
@@ -17,19 +17,35 @@ package aws
+	if err != nil {
+		return 0, err
+	}
+	plaintext, err := dingosops.Decrypt(ciphertext)
 	if err != nil {
+		b.logger.Errorf("failed to decrypt commit timestamp: %v", err)
</file context>

✅ Addressed in 579664e

keyGroups = append(keyGroups, keys)
}
} else {
return nil, errors.New(

@cubic-dev-ai cubic-dev-ai bot Nov 24, 2025

The AWS branch now returns an error whenever DINGO_AWS_KMS_KEY_ARNS is empty, so GCP-only configurations fail even though they already provide a master key, and the error references the wrong env var name.

<file context>
@@ -91,3 +84,47 @@ func Encrypt(data []byte) ([]byte, error) {
+			keyGroups = append(keyGroups, keys)
+		}
+	} else {
+		return nil, errors.New(
+			"DINGO_GCP_KMS_RESOURCE_ID not set: SOPS requires at least one master key to encrypt",
+		)
</file context>

✅ Addressed in 579664e

keyGroups = append(keyGroups, keys)
}
} else {
return nil, errors.New(

@cubic-dev-ai cubic-dev-ai bot Nov 24, 2025

Returning an error whenever DINGO_GCP_KMS_RESOURCE_ID is unset forces every deployment to configure the Google KMS key even if AWS KMS keys are present, so AWS-only setups cannot encrypt despite the "and/or" requirement.

<file context>
@@ -91,3 +84,47 @@ func Encrypt(data []byte) ([]byte, error) {
+			keyGroups = append(keyGroups, keys)
+		}
+	} else {
+		return nil, errors.New(
+			"DINGO_GCP_KMS_RESOURCE_ID not set: SOPS requires at least one master key to encrypt",
+		)
</file context>

✅ Addressed in 579664e

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1db1871 and 250a3a4.

📒 Files selected for processing (2)
  • database/plugin/blob/aws/commit_timestamp.go (1 hunks)
  • database/sops/sops.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
database/plugin/blob/aws/commit_timestamp.go (2)
database/plugin/blob/aws/database.go (1)
  • BlobStoreS3 (44-52)
database/sops/sops.go (2)
  • Decrypt (33-39)
  • Encrypt (41-86)

Signed-off-by: Akhil Repala <arepala@blinklabs.io>
Signed-off-by: Akhil Repala <arepala@blinklabs.io>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
database/sops/sops.go (1)

42-88: Encryption requirement breaks GCS and S3 deployments without KMS configuration.

The issue is confirmed. SetCommitTimestamp in GCS (line 68 of database/plugin/blob/gcs/commit_timestamp.go) and S3 (line 53 of database/plugin/blob/aws/commit_timestamp.go) backends unconditionally call dingosops.Encrypt(). Since Encrypt returns an error when getMasterKeyGroupsFromEnv() finds no configured keys (lines 116-120 of database/sops/sops.go), deployments without KMS configuration cannot write commit timestamps. This breaks backward compatibility for plaintext-only deployments.

The inconsistency is notable: the SQLite and Badger backends store timestamps in plaintext, while the GCS and S3 backends fail entirely without KMS keys.

To support truly optional encryption, modify:

  1. Encrypt() in database/sops/sops.go (lines 62-65): Detect missing KMS keys and return plaintext instead of error
  2. Decrypt() in database/sops/sops.go (lines 34-40): Add plaintext fallback when decrypt.Data() fails
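A minimal sketch of the two suggested changes, assuming a passthrough design. The `sealer` type, the `envelope` struct, and the XOR placeholder are hypothetical stand-ins, not the SOPS-backed code in database/sops/sops.go; the only property borrowed from the review is that encrypted output is JSON while legacy data is not.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// sealer is a hypothetical stand-in for the SOPS helpers.
type sealer struct {
	keysConfigured bool
}

// envelope mimics one property of SOPS output: it is valid JSON.
type envelope struct {
	Data []byte `json:"data"`
}

// xor is a placeholder transform, NOT real encryption.
func xor(in []byte) []byte {
	out := make([]byte, len(in))
	for i, b := range in {
		out[i] = b ^ 0x5a
	}
	return out
}

// Encrypt returns the input unchanged when no KMS key is configured,
// so plaintext-only deployments keep working.
func (s sealer) Encrypt(plain []byte) ([]byte, error) {
	if !s.keysConfigured {
		return plain, nil
	}
	return json.Marshal(envelope{Data: xor(plain)})
}

// Decrypt passes non-JSON input through as plaintext and refuses to
// guess when an envelope is present but no key is available.
func (s sealer) Decrypt(stored []byte) ([]byte, error) {
	if !json.Valid(stored) {
		return stored, nil
	}
	if !s.keysConfigured {
		return nil, errors.New("encrypted blob found but no KMS key is configured")
	}
	var env envelope
	if err := json.Unmarshal(stored, &env); err != nil {
		return nil, err
	}
	return xor(env.Data), nil
}

func main() {
	ts := []byte{0, 0, 0, 0, 0, 0, 0, 42}
	for _, s := range []sealer{{keysConfigured: false}, {keysConfigured: true}} {
		stored, _ := s.Encrypt(ts)
		back, err := s.Decrypt(stored)
		fmt.Println(s.keysConfigured, bytes.Equal(back, ts), err)
	}
}
```

One trade-off of this design is that a misconfigured deployment silently writes plaintext, so a real implementation would probably want to log when the passthrough path is taken.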
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b39b9f5 and 579664e.

📒 Files selected for processing (3)
  • database/plugin/blob/aws/commit_timestamp.go (1 hunks)
  • database/plugin/blob/gcs/commit_timestamp.go (2 hunks)
  • database/sops/sops.go (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: nilaway
  • GitHub Check: Analyze (go)
🔇 Additional comments (6)
database/sops/sops.go (2)

90-123: Past issues resolved: both KMS providers now optional.

The refactored getMasterKeyGroupsFromEnv correctly implements the "at least one master key" requirement:

  • Lines 94-102: GCP KMS is optional (only processed if DINGO_GCP_KMS_RESOURCE_ID is set)
  • Lines 105-114: AWS KMS is optional (only processed if DINGO_AWS_KMS_KEY_ARNS is set)
  • Lines 116-120: Error returned only when both are unset

This fixes the previous issues where GCP-only or AWS-only configurations would fail.
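The "at least one master key" rule described above can be sketched as follows. This is an illustration only: `masterKey` and `keyGroupsFromEnv` are hypothetical simplified names, the comma-separated ARN list and the one-group-per-provider layout are assumptions, and the real function builds SOPS key types rather than plain structs.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// masterKey is a hypothetical, simplified stand-in for a SOPS master key.
type masterKey struct {
	Provider string // "gcp" or "aws"
	ID       string
}

// keyGroupsFromEnv treats both providers as optional and fails only when
// neither variable yields a key. getenv is injected so the function can be
// exercised without touching the process environment.
func keyGroupsFromEnv(getenv func(string) string) ([][]masterKey, error) {
	var groups [][]masterKey

	if id := strings.TrimSpace(getenv("DINGO_GCP_KMS_RESOURCE_ID")); id != "" {
		groups = append(groups, []masterKey{{Provider: "gcp", ID: id}})
	}

	// Assumption: multiple ARNs are separated by commas.
	var awsGroup []masterKey
	for _, arn := range strings.Split(getenv("DINGO_AWS_KMS_KEY_ARNS"), ",") {
		if arn = strings.TrimSpace(arn); arn != "" {
			awsGroup = append(awsGroup, masterKey{Provider: "aws", ID: arn})
		}
	}
	if len(awsGroup) > 0 {
		groups = append(groups, awsGroup)
	}

	if len(groups) == 0 {
		return nil, errors.New(
			"neither DINGO_GCP_KMS_RESOURCE_ID nor DINGO_AWS_KMS_KEY_ARNS is set: at least one master key is required",
		)
	}
	return groups, nil
}

func main() {
	env := map[string]string{
		"DINGO_AWS_KMS_KEY_ARNS": "arn:aws:kms:us-east-1:111122223333:key/example",
	}
	groups, err := keyGroupsFromEnv(func(k string) string { return env[k] })
	fmt.Println(len(groups), err)
}
```

In production code the `getenv` argument would simply be `os.Getenv`. Naming both variables in the error message avoids the earlier problem of the message pointing at the wrong one.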


34-40: Binary format correctly configured for arbitrary data.

The switch to decrypt.Data(data, "binary") is correct for handling raw timestamp bytes. SOPS binary mode still produces JSON-wrapped encrypted output (containing metadata), which allows the migration logic in commit_timestamp.go files to distinguish SOPS-encrypted data from legacy plaintext using json.Valid().

database/plugin/blob/aws/commit_timestamp.go (2)

28-48: Migration logic correctly handles legacy plaintext timestamps.

The plaintext detection approach is sound:

  • SOPS-encrypted data is always JSON (wraps metadata)
  • Legacy plaintext is raw big-endian bytes (not JSON)
  • Line 34: !json.Valid(ciphertext) correctly distinguishes these cases

The migration flow (Lines 36-43):

  • Parses plaintext timestamp
  • Re-encrypts via SetCommitTimestamp
  • Logs migration attempt
  • Returns timestamp even if migration fails (prevents read failures)

This provides a smooth upgrade path from plaintext to encrypted storage.
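The read path described above can be sketched against an in-memory store. Everything here is hypothetical scaffolding: `blobStore`, `fakeEncrypt`, and `fakeDecrypt` stand in for the S3 backend and the SOPS helpers, and the length guard in `decodeTimestamp` is an addition to keep the sketch from panicking on short input.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
)

// envelope and the fake helpers preserve one SOPS property only:
// "encrypted" output is valid JSON.
type envelope struct {
	Data []byte `json:"data"`
}

func fakeEncrypt(p []byte) ([]byte, error) { return json.Marshal(envelope{Data: p}) }

func fakeDecrypt(c []byte) ([]byte, error) {
	var e envelope
	if err := json.Unmarshal(c, &e); err != nil {
		return nil, err
	}
	return e.Data, nil
}

// blobStore is an in-memory stand-in for the blob backends.
type blobStore struct {
	blob    []byte
	encrypt func([]byte) ([]byte, error)
	decrypt func([]byte) ([]byte, error)
}

func decodeTimestamp(raw []byte) (int64, error) {
	if len(raw) < 8 {
		return 0, errors.New("commit timestamp blob is shorter than 8 bytes")
	}
	return int64(binary.BigEndian.Uint64(raw)), nil
}

// SetCommitTimestamp encodes the value as big-endian bytes, encrypts, stores.
func (b *blobStore) SetCommitTimestamp(ts int64) error {
	raw := make([]byte, 8)
	binary.BigEndian.PutUint64(raw, uint64(ts))
	ct, err := b.encrypt(raw)
	if err != nil {
		return err
	}
	b.blob = ct
	return nil
}

// GetCommitTimestamp decrypts, falling back to legacy plaintext when the
// stored blob is not JSON, and re-encrypts the legacy value on the way out.
func (b *blobStore) GetCommitTimestamp() (int64, error) {
	plain, err := b.decrypt(b.blob)
	if err == nil {
		return decodeTimestamp(plain)
	}
	if json.Valid(b.blob) {
		return 0, err // looks like an envelope, so this is a real failure
	}
	ts, perr := decodeTimestamp(b.blob)
	if perr != nil {
		return 0, perr
	}
	if merr := b.SetCommitTimestamp(ts); merr != nil {
		// Migration is best effort: log and still return the value.
		fmt.Println("commit timestamp migration failed:", merr)
	}
	return ts, nil
}

func main() {
	s := &blobStore{
		blob:    []byte{0, 0, 0, 0, 0, 0, 0, 42}, // legacy plaintext
		encrypt: fakeEncrypt,
		decrypt: fakeDecrypt,
	}
	first, _ := s.GetCommitTimestamp()  // migrates the blob
	second, _ := s.GetCommitTimestamp() // reads the migrated envelope
	fmt.Println(first, second, json.Valid(s.blob))
}
```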


51-63: SetCommitTimestamp implementation is clean and well-logged.

The encryption flow is straightforward:

  • Line 52: Convert timestamp to big-endian bytes
  • Line 53: Encrypt via SOPS
  • Lines 54-57: Handle encryption failure
  • Lines 58-60: Store ciphertext
  • Line 61: Log successful write

Error handling and logging are appropriate. Note that this will fail when no KMS keys are configured (per the issue in sops.go).

database/plugin/blob/gcs/commit_timestamp.go (2)

28-60: GetCommitTimestamp mirrors AWS implementation correctly.

The GCS version uses the same plaintext detection and migration logic (Lines 44-54) as the AWS version:

  • Same json.Valid check to distinguish plaintext from SOPS data
  • Same migration path via SetCommitTimestamp
  • Same error handling (log but continue on migration failure)

This consistency is good for maintainability. The same considerations apply:

  • Depends on resolution of mandatory encryption behavior in sops.go
  • Will log migration errors repeatedly if no KMS keys configured

62-86: SetCommitTimestamp implementation is consistent with AWS version.

The encryption and storage flow matches the AWS implementation:

  • Lines 66-67: Convert timestamp to bytes
  • Lines 68-72: Encrypt with error handling
  • Lines 74-83: Write to GCS with proper cleanup
  • Line 84: Log successful write

The GCS-specific writer handling (Lines 75-83) correctly closes the writer even on write failure (Line 76). Good defensive coding.
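The close-on-failure pattern praised here can be shown with a plain `io.WriteCloser`. This is a sketch, not the code under review: `writeAndClose` and `fakeWriter` are hypothetical names, and joining the two errors is one possible choice for reporting them.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// writeAndClose writes data and always closes w, even when Write fails.
func writeAndClose(w io.WriteCloser, data []byte) error {
	if _, err := w.Write(data); err != nil {
		// Close anyway to release the writer, and report both errors.
		return errors.Join(err, w.Close())
	}
	// For object-storage writers, Close is typically where the upload is
	// finalized, so its error must not be dropped.
	return w.Close()
}

// fakeWriter is a test double that can be told to fail on Write.
type fakeWriter struct {
	failWrite bool
	closed    bool
}

func (f *fakeWriter) Write(p []byte) (int, error) {
	if f.failWrite {
		return 0, errors.New("write failed")
	}
	return len(p), nil
}

func (f *fakeWriter) Close() error {
	f.closed = true
	return nil
}

func main() {
	bad := &fakeWriter{failWrite: true}
	err := writeAndClose(bad, []byte("ciphertext"))
	fmt.Println(err != nil, bad.closed)
}
```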

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 3 files (reviewed changes from recent commits).

Prompt for AI agents (1 issue)

Understand the root cause of the following issue and fix it.


<file name="database/plugin/blob/gcs/commit_timestamp.go">

<violation number="1" location="database/plugin/blob/gcs/commit_timestamp.go:44">
Decrypt fallback wrongly interprets any non‑JSON ciphertext as a plaintext timestamp, so corrupted SOPS blobs get turned into random commit timestamps and overwritten instead of raising an error.</violation>
</file>
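One way to narrow the fallback, sketched under the assumption that legacy values are always exactly 8 bytes: classify the stored blob before deciding whether to migrate. The `blobKind` type and `classify` function are hypothetical and are not part of the PR. This is a heuristic and does not catch every form of corruption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type blobKind int

const (
	kindEncrypted blobKind = iota // valid JSON: hand to SOPS and surface any error
	kindLegacy                    // exactly 8 non-JSON bytes: safe to parse and migrate
	kindCorrupt                   // anything else: return an error, never overwrite
)

// classify decides how a stored commit timestamp blob should be handled.
// A truncated or damaged envelope is not valid JSON and is rarely exactly
// 8 bytes long, so it lands in kindCorrupt instead of being misread.
func classify(blob []byte) blobKind {
	switch {
	case json.Valid(blob):
		return kindEncrypted
	case len(blob) == 8:
		return kindLegacy
	default:
		return kindCorrupt
	}
}

func main() {
	fmt.Println(classify([]byte(`{"data":"AAAAAAAAACo="}`)) == kindEncrypted)
	fmt.Println(classify([]byte{0, 0, 0, 0, 0, 0, 0, 42}) == kindLegacy)
	fmt.Println(classify([]byte(`{"data":"AAAA`)) == kindCorrupt)
}
```

An 8-byte value that happens to be valid JSON (for example eight ASCII digits) is classified as encrypted and would then fail decryption with an error rather than being overwritten, which errs on the side of not destroying data.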

Signed-off-by: Akhil Repala <arepala@blinklabs.io>
@wolf31o2 wolf31o2 merged commit 040e5f2 into main Nov 24, 2025
13 checks passed
@wolf31o2 wolf31o2 deleted the aws_key_sops branch November 24, 2025 21:48

Development

Successfully merging this pull request may close these issues.

Implement optional Amazon KMS (Key Management Service) based encryption

3 participants