
feat: add pod_sysctls operator config option#3095

Open
csy1204 wants to merge 2 commits into zalando:master from csy1204:feat/pod-sysctls

@csy1204 csy1204 commented May 13, 2026

Motivation

compareStatefulSetWith in pkg/cluster/cluster.go compares the pod template's
securityContext via reflect.DeepEqual. The operator only ever populates
RunAsUser, RunAsGroup, and FSGroup on that struct (in
generatePodTemplate), so any other field that an external actor — typically a
cluster-wide mutating admission webhook — injects into the StatefulSet pod
template produces a permanent diff.
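
The effect can be reproduced outside the operator with a small sketch. The `Sysctl` and `PodSecurityContext` types below are simplified local stand-ins for the Kubernetes core/v1 types, and `securityContextDiffers` is a hypothetical helper rather than the operator's actual comparator; it only assumes the comparison is a plain `reflect.DeepEqual`, as described above. The user and group IDs are illustrative values.

```go
package main

import (
	"fmt"
	"reflect"
)

// Sysctl and PodSecurityContext are simplified stand-ins for the
// Kubernetes core/v1 types, so the sketch compiles without dependencies.
type Sysctl struct {
	Name  string
	Value string
}

type PodSecurityContext struct {
	RunAsUser  *int64
	RunAsGroup *int64
	FSGroup    *int64
	Sysctls    []Sysctl
}

// securityContextDiffers is a hypothetical helper that models the check
// described above: a reflect.DeepEqual over the whole struct.
func securityContextDiffers(desired, current *PodSecurityContext) bool {
	return !reflect.DeepEqual(desired, current)
}

func int64Ptr(v int64) *int64 { return &v }

func main() {
	// Desired state as generated by the operator: only the IDs are set.
	desired := &PodSecurityContext{
		RunAsUser:  int64Ptr(101),
		RunAsGroup: int64Ptr(103),
		FSGroup:    int64Ptr(103),
	}
	// Current state after a webhook injected a sysctl into the template.
	current := &PodSecurityContext{
		RunAsUser:  int64Ptr(101),
		RunAsGroup: int64Ptr(103),
		FSGroup:    int64Ptr(103),
		Sysctls:    []Sysctl{{Name: "net.ipv4.tcp_keepalive_time", Value: "600"}},
	}
	// The IDs match, but the extra Sysctls entry makes the structs unequal.
	fmt.Println("differs:", securityContextDiffers(desired, current))
}
```

Because the desired struct is regenerated identically on every sync, the reported difference cannot go away unless the operator itself sets the same field.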

This shows up most painfully with sysctls. Many internal Kubernetes platforms
inject TCP-keepalive sysctls (e.g. net.ipv4.tcp_keepalive_time,
net.ipv4.tcp_keepalive_intvl, net.ipv4.tcp_keepalive_probes) so that idle
TCP connections through an L4 load balancer don't go stale silently. Postgres
clusters benefit especially from such webhooks, because pgbouncer pools hold
many long-idle connections that would otherwise be reset (RST) on the next send.

Once such a webhook is active, on every resync_period tick the operator sees the
"pod template security context in spec does not match the current one" diff,
issues a rolling update, and performs a Patroni switchover. The cluster never
converges.

There is no operator-side way out of this today: configKubernetes exposes
spilo_runasuser / spilo_runasgroup / spilo_fsgroup and
additional_pod_capabilities, but no way to populate sysctls on the desired
template.

Change

Adds a new pod_sysctls option to KubernetesMetaConfiguration (mirroring the
shape of additional_pod_capabilities):

# OperatorConfiguration CRD
configuration:
  kubernetes:
    pod_sysctls:
    - name: net.ipv4.tcp_keepalive_time
      value: "600"
    - name: net.ipv4.tcp_keepalive_intvl
      value: "20"
    - name: net.ipv4.tcp_keepalive_probes
      value: "3"

When set, generatePodTemplate populates pod.spec.securityContext.Sysctls
verbatim. The desired template then matches whatever a webhook would inject
(assuming the webhook is idempotent on a value already present), and the
StatefulSet comparator no longer flags a diff.
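
A minimal sketch of that step, assuming the "set only when non-empty" behavior described in this PR. `applyPodSysctls` is a hypothetical helper name, and the types are local stand-ins for the core/v1 ones; the real change lives inside `generatePodTemplate`.

```go
package main

import "fmt"

// Sysctl and PodSecurityContext are simplified stand-ins for the
// Kubernetes core/v1 types.
type Sysctl struct {
	Name  string
	Value string
}

type PodSecurityContext struct {
	Sysctls []Sysctl
}

// applyPodSysctls (hypothetical) copies the configured list onto the
// security context verbatim, and leaves the field untouched when nothing
// is configured so the default behavior is preserved.
func applyPodSysctls(sc *PodSecurityContext, configured []Sysctl) {
	if len(configured) == 0 {
		return
	}
	sc.Sysctls = configured
}

func main() {
	configured := []Sysctl{
		{Name: "net.ipv4.tcp_keepalive_time", Value: "600"},
		{Name: "net.ipv4.tcp_keepalive_intvl", Value: "20"},
		{Name: "net.ipv4.tcp_keepalive_probes", Value: "3"},
	}

	withSysctls := &PodSecurityContext{}
	applyPodSysctls(withSysctls, configured)
	fmt.Println("configured entries:", len(withSysctls.Sysctls))

	withoutSysctls := &PodSecurityContext{}
	applyPodSysctls(withoutSysctls, nil)
	fmt.Println("field left unset:", withoutSysctls.Sysctls == nil)
}
```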

The list is applied verbatim — order and values must match the external
mutator. Empty (default) leaves Sysctls unset, preserving today's behavior.
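
The order requirement follows from how `reflect.DeepEqual` treats slices: elements are compared position by position, and a nil slice is not equal to an empty non-nil one. The sketch below illustrates both points with a local stand-in `Sysctl` type and a hypothetical `sysctlsMatch` helper; it is not the operator's code.

```go
package main

import (
	"fmt"
	"reflect"
)

// Sysctl is a simplified stand-in for the Kubernetes core/v1 type.
type Sysctl struct {
	Name  string
	Value string
}

// sysctlsMatch (hypothetical) compares two lists the way a DeepEqual-based
// comparator would: same length, same elements, same order.
func sysctlsMatch(desired, current []Sysctl) bool {
	return reflect.DeepEqual(desired, current)
}

func main() {
	injected := []Sysctl{
		{Name: "net.ipv4.tcp_keepalive_time", Value: "600"},
		{Name: "net.ipv4.tcp_keepalive_intvl", Value: "20"},
	}
	sameOrder := []Sysctl{
		{Name: "net.ipv4.tcp_keepalive_time", Value: "600"},
		{Name: "net.ipv4.tcp_keepalive_intvl", Value: "20"},
	}
	swapped := []Sysctl{
		{Name: "net.ipv4.tcp_keepalive_intvl", Value: "20"},
		{Name: "net.ipv4.tcp_keepalive_time", Value: "600"},
	}

	fmt.Println("same order matches:", sysctlsMatch(sameOrder, injected))
	fmt.Println("swapped order matches:", sysctlsMatch(swapped, injected))

	// A nil slice and an empty non-nil slice are not deeply equal, which is
	// why leaving the field unset by default keeps the old comparison result.
	var unset []Sysctl
	fmt.Println("nil matches empty:", sysctlsMatch(unset, []Sysctl{}))
}
```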

Files

  • pkg/util/config/config.go: PodSysctls []v1.Sysctl in Resources
  • pkg/apis/acid.zalan.do/v1/operator_configuration_type.go: matching CRD type
  • pkg/apis/acid.zalan.do/v1/zz_generated.deepcopy.go: DeepCopy for the slice
  • pkg/apis/acid.zalan.do/v1/crds.go,
    manifests/operatorconfiguration.crd.yaml,
    charts/postgres-operator/crds/operatorconfigurations.yaml: CRD openAPI schema
  • pkg/controller/operator_config.go: CRD → internal config mapping
  • pkg/cluster/k8sres.go: apply to securityContext.Sysctls
  • pkg/cluster/k8sres_test.go: TestPodSysctls (configured + empty cases)
  • manifests/postgresql-operator-default-configuration.yaml,
    charts/postgres-operator/values.yaml,
    docs/reference/operator_parameters.md: examples and reference docs

Test

go build ./...
go vet ./...
go test ./pkg/cluster/ -run TestPodSysctls -v -count=1
=== RUN   TestPodSysctls
=== RUN   TestPodSysctls/sysctls_applied_to_pod_securityContext_when_configured
=== RUN   TestPodSysctls/sysctls_omitted_when_not_configured
--- PASS: TestPodSysctls (0.00s)
PASS
ok  	github.com/zalando/postgres-operator/pkg/cluster

Full ./pkg/cluster/, ./pkg/util/config/, ./pkg/controller/,
./pkg/apis/... test suites pass.

Scope

Available in OperatorConfiguration CRD mode only, matching the existing pattern
for complex-typed config (e.g. sidecars, which is also CRD-mode only).

DCO signed off.

Adds a new `pod_sysctls` option to KubernetesMetaConfiguration that
populates `pod.spec.securityContext.sysctls` in the operator-generated
StatefulSet pod template.

Motivation
----------
Cluster-wide mutating admission webhooks (e.g. internal platform policies
that inject TCP keepalive sysctls so connections through L4 load balancers
don't go stale) modify the StatefulSet pod template after the operator
creates it. The StatefulSet comparator in `compareStatefulSetWith` uses
`reflect.DeepEqual` on `pod.spec.securityContext`, so any field the
operator does not set (like `Sysctls`) but the webhook injects produces
a permanent diff. Every sync (`resync_period`, default 30m) then sees
"pod template security context in spec does not match the current one"
and triggers a rolling update + Patroni switchover, indefinitely.

By declaring the same sysctls list in the operator config, the
operator-generated template matches the webhook-mutated cluster state
and no spurious rolling restart is triggered.

Implementation
--------------
* `pkg/util/config/config.go`: add `PodSysctls []v1.Sysctl` to the
  `Resources` config struct.
* `pkg/apis/acid.zalan.do/v1/operator_configuration_type.go`: add
  matching CRD field.
* `pkg/apis/acid.zalan.do/v1/zz_generated.deepcopy.go`: handle the new
  slice in DeepCopyInto.
* `pkg/apis/acid.zalan.do/v1/crds.go`,
  `manifests/operatorconfiguration.crd.yaml`,
  `charts/postgres-operator/crds/operatorconfigurations.yaml`: extend
  the CRD openAPI schema (array of `{name, value}` objects).
* `pkg/controller/operator_config.go`: copy the CRD field into the
  internal config struct.
* `pkg/cluster/k8sres.go`: in `generatePodTemplate`, set
  `securityContext.Sysctls = c.OpConfig.PodSysctls` when non-empty.
* `pkg/cluster/k8sres_test.go`: new `TestPodSysctls` covering both the
  configured and the empty case.
* `manifests/postgresql-operator-default-configuration.yaml`,
  `charts/postgres-operator/values.yaml`,
  `docs/reference/operator_parameters.md`: example usage and reference
  documentation.

Notes
-----
* Available in the OperatorConfiguration CRD mode only (matches the
  existing pattern for complex-typed options such as `sidecars`).
* The list is applied verbatim — the order and values must match what
  the external mutator expects, or the comparator will still flag a
  diff.

Signed-off-by: Sang Yeon Cho <sang-yeon.cho@navercorp.com>