feat(logical-backup): configurable job history limits and TTL #3091

Open
yajo wants to merge 1 commit into zalando:master from moduon:fix/logical-backup-job-cleanup

@yajo (Contributor) commented May 6, 2026

This PR adds three new configuration options for logical backup cronjobs:

  • logical_backup_successful_jobs_history_limit (default: 3)
  • logical_backup_failed_jobs_history_limit (default: 3)
  • logical_backup_ttl_seconds_after_finished (default: 86400)
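
As a rough illustration of the three options and their stated defaults, here is a minimal Go sketch. The struct name `LogicalBackupRetention`, its field names, and the constructor are hypothetical and chosen for this example; the actual fields added to the operator's config structs may be named and typed differently.

```go
package main

import "fmt"

// LogicalBackupRetention is a hypothetical stand-in for the three new
// operator options. The real struct and field names may differ.
type LogicalBackupRetention struct {
	SuccessfulJobsHistoryLimit int32 // logical_backup_successful_jobs_history_limit
	FailedJobsHistoryLimit     int32 // logical_backup_failed_jobs_history_limit
	TTLSecondsAfterFinished    int32 // logical_backup_ttl_seconds_after_finished
}

// DefaultLogicalBackupRetention returns the defaults listed in this description.
func DefaultLogicalBackupRetention() LogicalBackupRetention {
	return LogicalBackupRetention{
		SuccessfulJobsHistoryLimit: 3,
		FailedJobsHistoryLimit:     3,
		TTLSecondsAfterFinished:    86400, // 24 hours
	}
}

func main() {
	r := DefaultLogicalBackupRetention()
	fmt.Printf("successful=%d failed=%d ttl=%ds\n",
		r.SuccessfulJobsHistoryLimit, r.FailedJobsHistoryLimit, r.TTLSecondsAfterFinished)
}
```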

Problem

Currently, the postgres-operator does not configure any of these fields on the logical backup CronJob. This means:

  • Kubernetes defaults apply (3 successful, 1 failed), which may not be sufficient for clusters with many PostgreSQL instances.
  • No ttlSecondsAfterFinished is set, so completed/failed backup Jobs and their Pods accumulate indefinitely.
  • When the CronJob is recreated (e.g., after spec changes), old Jobs are orphaned and never cleaned up.

This fixes #1092.

Solution

  1. Added new config fields to the operator configuration structs (Go + CRD + Helm values).
  2. Injected the fields into the generated CronJob and JobTemplate specs.
  3. Updated the CronJob comparison logic so the operator detects changes to these fields and reconciles accordingly.
  4. Added and updated unit tests to verify the new defaults and behavior.
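
Steps 2 and 3 can be sketched roughly as follows. This is not the code from the diff: the lowercase types are local stand-ins that mirror only the relevant pointer fields of the Kubernetes `batch/v1` CronJob and Job specs, and `withRetention`, `equalInt32Ptr`, and `retentionChanged` are hypothetical helper names.

```go
package main

import "fmt"

// Local stand-ins mirroring the relevant optional (*int32) fields of batch/v1.
type jobSpec struct {
	TTLSecondsAfterFinished *int32
}

type jobTemplateSpec struct {
	Spec jobSpec
}

type cronJobSpec struct {
	SuccessfulJobsHistoryLimit *int32
	FailedJobsHistoryLimit     *int32
	JobTemplate                jobTemplateSpec
}

func int32Ptr(v int32) *int32 { return &v }

// withRetention returns a copy of spec with the three retention fields set.
func withRetention(spec cronJobSpec, successful, failed, ttl int32) cronJobSpec {
	spec.SuccessfulJobsHistoryLimit = int32Ptr(successful)
	spec.FailedJobsHistoryLimit = int32Ptr(failed)
	spec.JobTemplate.Spec.TTLSecondsAfterFinished = int32Ptr(ttl)
	return spec
}

// equalInt32Ptr compares optional values: two nils are equal, nil never
// equals a set value, and set values are compared by content, not address.
func equalInt32Ptr(a, b *int32) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// retentionChanged reports whether any retention field differs, which is
// the signal a reconciler would use to update the existing CronJob.
func retentionChanged(current, desired cronJobSpec) bool {
	return !equalInt32Ptr(current.SuccessfulJobsHistoryLimit, desired.SuccessfulJobsHistoryLimit) ||
		!equalInt32Ptr(current.FailedJobsHistoryLimit, desired.FailedJobsHistoryLimit) ||
		!equalInt32Ptr(current.JobTemplate.Spec.TTLSecondsAfterFinished,
			desired.JobTemplate.Spec.TTLSecondsAfterFinished)
}

func main() {
	existing := cronJobSpec{} // a CronJob created before these options existed
	desired := withRetention(existing, 3, 3, 86400)
	fmt.Println("needs update:", retentionChanged(existing, desired))
}
```

Comparing the pointed-to values rather than the pointers matters here, because two separately generated specs never share addresses even when their settings are identical.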

Files changed

  • pkg/util/config/config.go — new config fields
  • pkg/apis/acid.zalan.do/v1/operator_configuration_type.go — CRD struct fields
  • pkg/controller/operator_config.go — wiring from CRD to internal config
  • pkg/cluster/k8sres.go — CronJob generation
  • pkg/cluster/cluster.go — CronJob comparison
  • pkg/cluster/k8sres_test.go — generation tests
  • pkg/cluster/cluster_test.go — comparison tests
  • charts/postgres-operator/values.yaml — Helm values
  • charts/postgres-operator/crds/operatorconfigurations.yaml — CRD schema

Disclaimer

I am not a Go programmer. This PR was written with AI assistance from kimi-k2.6.

@moduon

Adds three new configuration options for logical backup cronjobs:
- logical_backup_successful_jobs_history_limit (default: 3)
- logical_backup_failed_jobs_history_limit (default: 3)
- logical_backup_ttl_seconds_after_finished (default: 86400)

These options control how many completed/failed backup jobs are
retained by Kubernetes and when finished jobs are automatically
deleted. This prevents accumulation of old backup jobs and pods
in namespaces with many PostgreSQL clusters.

Also updates the CronJob comparison logic to detect changes in
these new fields and trigger reconciliation when needed.

Closes zalando#1092

Development

Successfully merging this pull request may close these issues.

Logical backup cronjob cleanup