Skip to content

feat: platform secrets as sealed secrets#2978

Open
ferruhcihan wants to merge 118 commits intomainfrom
APL-523
Open

feat: platform secrets as sealed secrets#2978
ferruhcihan wants to merge 118 commits intomainfrom
APL-523

Conversation

@ferruhcihan
Copy link
Copy Markdown
Collaborator

@ferruhcihan ferruhcihan commented Feb 24, 2026

📌 Summary

PRs: apl-api | apl-tasks | apl-console

🔍 Reviewer Notes

🧹 Checklist

  • Code is readable, maintainable, and robust.
  • Unit tests added/updated

Copy link
Copy Markdown
Contributor

@CasLubbers CasLubbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work you did some nice improvements. I can understand it much better right now.

const applySealedSecretResource = async (manifest: SealedSecretManifest): Promise<void> => {
await k8s.custom().patchNamespacedCustomObject(
{
group: 'bitnami.com',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice this is already really clean! I would only make constants out of the group, version and plural. And maybe fieldManager we also use that in the applyAsApps for ArgoCD

cert.validity.notAfter = new Date()
cert.validity.notAfter.setFullYear(cert.validity.notBefore.getFullYear() + 10)

const attrs = [
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also a good contender to move out of the function and create a constant with good naming

]
cert.setSubject(attrs)
cert.setIssuer(attrs)
cert.setExtensions([
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here this object can get out of this function and get a good name

* - apps.X -> X-secrets
* - topLevel -> topLevel-secrets
*/
const deriveSecretName = (secretPath: string): string => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one! this was that hardcoded map right?


// Read all YAML files in the sealedsecrets subdirectory
const files = await deps.readdir(sealedSecretsDir)
for (const file of files) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This for loop can also be in a separate function with something like: applySealedSecretsFromFiles? Then you'll directly see what it is and does

}
}

public async applyRecoveryManifests(): Promise<void> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is a bit tricky. It will work for now, but what if we want to add other items then secrets to the recovery?

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates APL Core’s platform secret handling from SOPS-encrypted “secrets.*.yaml” files to a SealedSecrets + External Secrets Operator (ESO) model, updating Helmfile/values templates and installer flows accordingly.

Changes:

  • Introduces ESO and a ClusterSecretStore that sources secrets from the apl-secrets namespace, with app templates updated to consume secrets via existingSecret / secretKeyRef.
  • Adds SealedSecrets bootstrapping/migration logic (incl. recovery support) and removes most SOPS wiring from bootstrap/install/values writing.
  • Updates fixtures and schema/values-change versioning to reflect the new secret storage approach.

Reviewed changes

Copilot reviewed 141 out of 141 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
versions.yaml Pins linked repos to APL-523 branches.
values/prometheus-operator/prometheus-operator.gotmpl Switches Alertmanager/Grafana to use existing secrets/env-based secret injection.
values/prometheus-operator/prometheus-operator-raw.gotmpl Adds ESO ExternalSecrets for Grafana/Alertmanager/Prometheus RW credentials.
values/otomi-api/otomi-api.gotmpl Switches git password usage to existingSecret.
values/otomi-api/otomi-api-raw.gotmpl Adds ESO ExternalSecret for otomi-api git credentials.
values/oauth2-proxy/oauth2-proxy.gotmpl Moves Redis password to existingSecret.
values/oauth2-proxy/oauth2-proxy-raw.gotmpl Creates ESO ExternalSecrets for oauth2-proxy client + Redis credentials.
values/loki/loki-raw.gotmpl Replaces inline Secrets with ESO ExternalSecrets for auth config and object storage.
values/keycloak/keycloak-raw.gotmpl Replaces initial admin/linode creds Secrets with ESO ExternalSecrets.
values/k8s/k8s-raw.gotmpl Adds CoreDNS custom ConfigMap for hairpin avoidance (conditional).
values/ingress-nginx/ingress-nginx-raw.gotmpl Adds ClusterRole(+Binding) for cross-namespace TLS secret reads.
values/harbor/harbor.gotmpl Uses existing CSRF secret and injects registry username via SecretRef env.
values/harbor/harbor-raw.gotmpl Converts Harbor-related Secrets to ESO ExternalSecrets.
values/gitea/gitea.gotmpl Moves SMTP password to env SecretRef and uses existing admin secret.
values/gitea/gitea-raw.gotmpl Adds ESO ExternalSecrets for Gitea admin, SMTP, and linode creds.
values/gitea-db-secret/gitea-db-secret-raw.gotmpl Converts Gitea DB secret to ESO ExternalSecret.
values/external-secrets/external-secrets.gotmpl Adds values template for the ESO chart.
values/external-secrets/external-secrets-raw.gotmpl Adds SA/RBAC and ClusterSecretStore for in-cluster secret sourcing.
values/external-dns/external-dns.gotmpl Moves Akamai tokens to SecretRefs; normalizes google secret name.
values/external-dns/external-dns-raw.gotmpl Converts DNS provider Secrets to ESO ExternalSecrets.
values/cert-manager/cert-manager-raw.gotmpl Converts external-dns + CA material + BYO TLS secret to ESO ExternalSecrets.
values/argocd/argocd.gotmpl Removes direct OIDC clientSecret injection from values.
values/argocd/argocd-raw.gotmpl Converts ArgoCD repo creds to ESO ExternalSecrets; adds OIDC secret merge ExternalSecret.
values/apl-operator/apl-operator.gotmpl Removes git password from values; operator reads from K8s secret.
values/apl-operator/apl-operator-raw.gotmpl Adds ESO ExternalSecret for apl-git-credentials.
values/apl-keycloak-operator/apl-keycloak-operator-raw.gotmpl Converts operator secret to ESO ExternalSecret (drops inline USERS secret).
values/apl-harbor-operator/apl-harbor-operator-raw.gotmpl Converts operator secret to ESO ExternalSecret.
values/apl-gitea-operator/apl-gitea-operator-raw.gotmpl Converts operator secret to ESO ExternalSecret.
values-schema.yaml Updates secret templates + adds recovery manifests schema + adjusts required secrets.
values-changes.yaml Adds specVersion 60 migration (deletes kms.sops + runs sopsMigration).
tests/fixtures/env/users/secrets.*.yaml Removes SOPS-style user secret fixtures.
tests/fixtures/env/teams/*/secrets.settings.yaml Removes SOPS-style team secret fixtures.
tests/fixtures/env/settings/secrets.*.yaml Removes SOPS-style settings secret fixtures.
tests/fixtures/env/settings/kms.yaml Updates fixture kms.yaml structure (adds clientSecret).
tests/fixtures/env/settings/versions.yaml Bumps fixture specVersion to 60.
tests/fixtures/env/manifests/**/sealedsecrets/*.yaml Adds sealed secret fixtures for apl-users and apl-secrets namespaces.
tests/fixtures/env/apps/secrets.*.yaml Removes SOPS-style app secret fixtures.
tests/fixtures/env/apps/cert-manager.yaml Moves customRootCA into non-secrets app config fixture.
tests/bootstrap/input*.yaml Removes kms.sops provider defaults from bootstrap inputs.
src/server.ts Removes SOPS-related endpoints (/init, /prepare) and schema endpoint.
src/operator/main.ts Adds recovery manifest application + re-enables env/secrets setup.
src/operator/installer.ts Adds recovery manifest apply + git verification + pending Helm release cleanup; makes SOPS optional.
src/operator/apl-operator.ts Minor import ordering/formatting change.
src/common/values.ts Stops splitting/separating secrets at write-time; updates team-secret generation.
src/common/values.test.ts Adds tests for expanded team secret path inclusion.
src/common/repo.ts Stops writing secrets files; adds loading SealedSecrets into spec.
src/common/repo.test.ts Removes tests around secrets.* file path/manifest writing.
src/common/k8s.ts Adds namespace ensure helper; improves pending helm secret deletion by revision.
src/common/k8s.test.ts Updates pending helm secret deletion expectations.
src/common/hf.ts Ensures helmfile runs from rootDir regardless of process cwd.
src/common/gitea.ts Removes global cd, uses zx cwd-scoped runner.
src/common/git-config.ts Makes getRepo async and pulls git password from K8s secret when needed.
src/common/git-config.test.ts Updates tests for async getRepo and placeholder behavior.
src/common/envalid.ts Reorders env spec; keeps SOPS_AGE_KEY optional.
src/common/constants.ts Adds constants for otomi/apl-secrets namespaces and secret names.
src/common/bootstrap.ts Makes setIdentity cwd-aware; strips secrets before writing defaults; uses async getRepo.
src/cmd/validate-values.ts Removes x-secret fields from required during validation.
src/cmd/pull.ts Uses async getRepo and zx cwd-scoped runner.
src/cmd/migrate.ts Adds sops→sealed-secrets migration logic and SOPS artifact cleanup.
src/cmd/migrate.test.ts Adds unit tests for sopsMigration/removeSopsArtifacts.
src/cmd/install.ts Installs sealed-secrets + applies sealed manifests + installs ESO + validates git push.
src/cmd/install.test.ts Updates mocks for zx and sealed-secrets integration.
src/cmd/commit.ts Makes git operations cwd-scoped; resolves initial credentials from K8s secrets.
src/cmd/bootstrap.ts Removes SOPS bootstrap; writes secret-stripped values; bootstraps SealedSecrets manifests.
helmfile.d/snippets/grafana.gotmpl Switches OAuth client config to env var placeholders.
helmfile.d/snippets/derived.gotmpl Removes derived Harbor/Keycloak secret defaults.
helmfile.d/snippets/defaults.yaml Adds external-secrets defaults; bumps specVersion to 60.
helmfile.d/snippets/defaults.gotmpl Removes generated default secrets from dynamic defaults.
helmfile.d/snippets/alertmanager*.gotmpl Switches Slack/Opsgenie/SMTP secrets to templated secret keys.
helmfile.d/helmfile-70.shared.yaml.gotmpl Adds otomi-api artifacts raw release.
helmfile.d/helmfile-60.teams.yaml.gotmpl Uses existing secrets for team monitoring + adds per-team ExternalSecret resources.
helmfile.d/helmfile-04.init.yaml.gotmpl Adds apl-operator artifacts raw release.
helmfile.d/helmfile-01.init.yaml.gotmpl Adds external-secrets + artifacts releases.
core.yaml Adds namespaces for external-secrets and apl-secrets.
charts/otomi-api/values.yaml Adds existingSecret support; removes SOPS env secret fields.
charts/otomi-api/templates/deployment.yaml Loads envFrom from optional existingSecret.
charts/external-secrets/Chart.yaml Adds a local external-secrets chart.
charts/external-secrets/values.yaml Adds minimal ESO values.
charts/external-secrets/templates/* Adds minimal ESO Deployment/RBAC/SA templates.
charts/external-secrets/crds/* Adds ESO CRDs.
charts/apl-operator/templates/deployment.yaml Marks apl-sops-secrets secretRef optional.
chart/chart-index/Chart.yaml Adds external-secrets as a dependency in chart-index.
chart/apl/templates/sops-secrets.yaml Removes SOPS secret template.
chart/apl/templates/deployment.yaml Removes conditional envFrom block for SOPS secret.
.github/workflows/integration.yml Changes default KMS mode to no_kms.
.cspell.json Adds “bitnami” to dictionary.

Comment on lines +85 to +90
{{- $escapedDomain := $v.cluster.domainSuffix | replace "." "\\." }}
{{- if $v.apps | get "ingress-nginx-platform.enabled" false }}
rewrite name regex (.+)\.{{ $escapedDomain }} ingress-nginx-platform-controller.ingress.svc.cluster.local answer auto
{{- else }}
rewrite name regex (.+)\.{{ $escapedDomain }} {{ $v.ingress.platformClass.className }}-istio.istio-system.svc.cluster.local answer auto
{{- end }}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$v.apps | get "ingress-nginx-platform.enabled" false treats the dot as part of the key name, so this condition will never be true with the current values structure (where ingress-nginx-platform is a nested object). That will always route to the Istio service even when the nginx ingress controller is enabled.

Use dig "ingress-nginx-platform" "enabled" false, or ($v.apps | get "ingress-nginx-platform" dict).enabled to make the condition reflect the actual config structure.

Copilot uses AI. Check for mistakes.
Comment on lines +580 to +584
/**
* Read sealed secret manifests and merge their encryptedData back into the values spec.
* This restores secret values that helmfile templates need at render time.
*
* Uses the values schema (x-secret paths) to correctly map sealed secret data keys
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loadSealedSecretsToSpec merges spec.encryptedData from SealedSecret manifests back into the values spec as if it were the original secret value. encryptedData is ciphertext by design, so downstream consumers that expect a real secret (e.g., for templating, migrations, or generating derived secrets) will end up using sealed ciphertext and produce invalid configs (or even double-seal values).

Instead of restoring ciphertext into .Values, either (1) stop merging SealedSecret encryptedData into the spec entirely and ensure all charts consume secrets via K8s SecretRefs/ExternalSecrets, or (2) if you need placeholders, write a deterministic placeholder (e.g. sealed:<ns>/<secret>/<key>) and teach consumers to resolve it from the cluster when needed.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +16
- apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: eso-core-secret-reader
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list", "watch"]
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RBAC here grants the ESO store service account get/list/watch on all Secrets cluster-wide via a ClusterRole. Since the Kubernetes provider is configured with remoteNamespace: apl-secrets, this can be scoped down to only that namespace (Role + RoleBinding in apl-secrets) to avoid over-privileging a component that can read every Secret in the cluster.

Copilot uses AI. Check for mistakes.
Comment on lines 99 to 104
podMetadata:
labels:
otomi.io/auth-policy: monitoring-{{ $teamId }}
# to do: load slackTpl and opsgenieTpl only if alerts.receicers = true
config: {{- tpl (readFile "../helmfile.d/snippets/alertmanager-teams.gotmpl") (dict "instance" $teamSettings "root" $v "slackTpl" $slackTpl "opsgenieTpl" $opsgenieTpl) | nindent 12 }}
useExistingSecret: true
configSecret: alertmanager-team-{{ $teamId }}-config
route:
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This release config now references several existingSecret / configSecret names (Grafana admin secret, Alertmanager config secret, Loki datasource password) that are created later in the same Helmfile loop by the team-secrets-{{ $teamId }} raw release. On a fresh install this can cause pods to fail to start until the ExternalSecrets have been reconciled.

Consider enforcing ordering with Helmfile needs: (e.g., make prometheus-{{ $teamId }} depend on team-secrets-{{ $teamId }}) or move the team-secrets release above the charts that consume those Secrets.

Copilot uses AI. Check for mistakes.
{{- $slackTpl := tpl (readFile "../../helmfile.d/snippets/alertmanager/slack.gotmpl") $v | toString }}
{{- $opsgenieTpl := tpl (readFile "../../helmfile.d/snippets/alertmanager/opsgenie.gotmpl") $v | toString }}
{{- $grafanaIni := tpl (readFile "../../helmfile.d/snippets/grafana.gotmpl") (dict "keycloakBase" $v._derived.oidcBaseUrl "untrustedCA" $v._derived.untrustedCA "keycloak" ($k | get "idp")) | toString }}
{{- $grafanaIni := tpl (readFile "../../helmfile.d/snippets/grafana.gotmpl") (dict "keycloakBase" $v._derived.oidcBaseUrl "untrustedCA" $v._derived.untrustedCA "keycloak" (dict "clientID" ($k | get "idp.clientID" "otomi"))) | toString }}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$k | get "idp.clientID" "otomi" treats idp.clientID as a literal key name, so it will always fall back to the default unless the map actually has a key with a dot in it. If you still need this value, use dig "idp" "clientID" "otomi" or $k.idp.clientID.

(If Grafana now always reads the client id from env/secret, you can also drop this dict field entirely to avoid dead/misleading templating.)

Copilot uses AI. Check for mistakes.
Comment on lines 114 to 118
/**
* Writes new values to the repo. Will keep the original values if `overwrite` is `false`.
* Secret values are written as-is — they are protected by SealedSecrets on the cluster side,
* and child secrets are derived via ESO ExternalSecret CRs.
*/
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment says "Secret values are written as-is", but writeValues() now calls saveValues(env.ENV_DIR, inValues, {}) and the rest of the PR strips secrets before persisting values. This comment is misleading and should be updated to reflect the new behavior (secrets are stored via SealedSecrets/ESO, not written into the values repo).

Copilot uses AI. Check for mistakes.
Comment on lines 733 to +737

await k8s.deletePendingHelmReleases()
expect(mockGetPendingHelmReleases).toHaveBeenCalled()
expect(mockDeleteSecretForHelmRelease).toHaveBeenNthCalledWith(1, 'release-1', 'ns-1')
expect(mockDeleteSecretForHelmRelease).toHaveBeenNthCalledWith(3, 'release-2', 'ns-2')
expect(mockDeleteSecretForHelmRelease).toHaveBeenNthCalledWith(2, 'release-1', 'ns-1', 2)
expect(mockDeleteSecretForHelmRelease).toHaveBeenNthCalledWith(3, 'release-2', 'ns-2', 1)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These toHaveBeenNthCalledWith(2, ...) / (3, ...) assertions make this test order-dependent across multiple it(...) cases, because mocks/spies in this describe are not reset between tests. This can become flaky when tests are added/reordered.

Add a beforeEach(() => jest.clearAllMocks()) inside the describe('helm operations in progress check', ...), or assert against mock.calls created within this specific test only.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants