Skip to content

Core: cleanExpiredMetadata cleans unreferenced encryption keys#16395

Draft
Hugo-WB wants to merge 1 commit into
apache:mainfrom
Hugo-WB:cleanup-encryption-keys-when-cleanup-metadata
Draft

Core: cleanExpiredMetadata cleans unreferenced encryption keys#16395
Hugo-WB wants to merge 1 commit into
apache:mainfrom
Hugo-WB:cleanup-encryption-keys-when-cleanup-metadata

Conversation

@Hugo-WB
Copy link
Copy Markdown
Contributor

@Hugo-WB Hugo-WB commented May 18, 2026

Fixes part 2 of: #16352

The implementation constructs a set of "reachable"/"referenced" encryption keys-ids. Any key that is not referenced, we delete.

A key can be referenced in 2 places:

  • In the snapshot list. Under key-id for every snapshot. This references the key that was used to encrypt the manifest list for that snapshot
  • In the encrypted-by-id for another encryption key.

@github-actions github-actions Bot added the core label May 18, 2026
.map(EncryptedKey::keyId)
.filter(keyId -> !reachableKeyIds.contains(keyId))
.collect(Collectors.toSet());
encryptionKeysToRemove.forEach(updatedMetaBuilder::removeEncryptionKey);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a similar line to: #12670

I see 2 arguments to introduce/make RemoveEncryptionKey -> RemoveEncryptionKeys. (Bulk).

  • Performance: We've ran into performance issues server side when expiring 150k+ snapshots in a non-bulk way. Bulking the changes fixed it. Although I don't expect RemoveEncryptionKey to be as expensive of a call as RemoveSnapshot. So perhaps a premature optimisation.
  • Consistency: Consistent with the other RemovePartitionSpecs and RemoveSchemas

Curious to hear thoughts.

Sets.newConcurrentHashSet(
base.encryptionKeys().stream()
.map(EncryptedKey::encryptedById)
.filter(Objects::nonNull)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own understanding. Couldn't figure out why encryptedById is nullable. In what cases do we expect this to be null?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant