[CI-1951] fix(podiprecovery): add Pod watch so recovery is level-triggered #4987

Open
coutinhop wants to merge 1 commit into tigera:master from coutinhop:pedro-CI-1951-2

@coutinhop

Member

Recovery was edge-triggered on the Node watch alone: the reconcile fired only when a node's host IPs changed. On a KubeVirt VM reboot the node's new IP is reported promptly, but the node's host-networked pods are still restarting at that instant with empty status.podIPs, so the reconcile skips them. Seconds later a surviving pod comes back reporting its old, now-stale IP (Kubernetes never refreshes status.podIPs for a surviving hostNetwork pod) — but the node's host IP has already settled, no further Node event fires, and the stale pod is never re-evaluated. The earlier autoscaler-tick approach did not have this gap because it re-checked on every tick.
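The staleness check described above can be sketched in isolation. This is a minimal illustration, not the operator's code: `Pod`, `Node`, and `needsRecovery` are hypothetical stand-ins for the Kubernetes types and the real Reconcile logic, and treating any unmatched pod IP as stale is an assumption.

```go
package main

// Pod and Node are hypothetical, stripped-down stand-ins for the
// Kubernetes objects; only the fields the decision needs are kept.
type Pod struct {
	Name        string
	HostNetwork bool
	PodIPs      []string // mirrors status.podIPs
}

type Node struct {
	Name      string
	Addresses []string // host IPs currently reported by the node
}

// needsRecovery reports whether a host-networked pod advertises an IP
// that its node no longer holds. A pod with empty PodIPs cannot be
// judged yet, so it is skipped, which is exactly the window in which
// a Node-only trigger misses the pod.
func needsRecovery(p Pod, n Node) bool {
	if !p.HostNetwork || len(p.PodIPs) == 0 {
		return false
	}
	held := make(map[string]bool, len(n.Addresses))
	for _, a := range n.Addresses {
		held[a] = true
	}
	for _, ip := range p.PodIPs {
		if !held[ip] {
			return true // assumption: any unmatched IP counts as stale
		}
	}
	return false
}
```

At the moment of the Node event the pod has no IPs and the function returns false; once the pod reports its old IP the same call returns true, but only if something triggers a second evaluation.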

Add a second watch on operator-managed host-networked Pods that re-enqueues a pod's node when the pod settles into a state where its status.podIPs can be judged — its IPs appear/change, or it becomes Ready. Both watches funnel into the same node-keyed, idempotent Reconcile, so recovery is now level-triggered on both inputs to its decision (node addresses and pod IPs) while staying event-driven — no return to polling.
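To illustrate how two watches can funnel into one node-keyed reconcile, here is a rough, dependency-free sketch. Only the name `podToNode` comes from this PR; its signature here, the `Pod` struct, and `nodeQueue` are hypothetical, and returning no key for an unscheduled pod is an assumption.

```go
package main

// Pod is a hypothetical stand-in carrying only what the mapping needs.
type Pod struct {
	Name     string
	NodeName string // mirrors spec.nodeName
}

// podToNode maps a Pod event to the reconcile key of the node it runs
// on. A pod that has not been scheduled yields no key (assumption).
func podToNode(p Pod) []string {
	if p.NodeName == "" {
		return nil
	}
	return []string{p.NodeName}
}

// nodeQueue is a tiny deduplicating queue keyed by node name. It only
// models the property that matters here: a Node event and a Pod event
// for the same node collapse into a single pending reconcile.
type nodeQueue struct {
	pending map[string]bool
	order   []string
}

func newNodeQueue() *nodeQueue {
	return &nodeQueue{pending: map[string]bool{}}
}

// add enqueues a node key unless it is already pending.
func (q *nodeQueue) add(node string) {
	if q.pending[node] {
		return
	}
	q.pending[node] = true
	q.order = append(q.order, node)
}

// len returns the number of distinct pending node keys.
func (q *nodeQueue) len() int { return len(q.order) }
```

Because Reconcile is idempotent and keyed by node, it does not matter which watch enqueued the key or how many times; each run re-reads the current node addresses and pod IPs.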

The predicate is gated on the host-networked marker label plus spec.hostNetwork, so event volume stays limited to the handful of such pods cluster-wide.
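A hedged sketch of what such a gated pod-settle check could look like for update events follows. The label key `markerLabel`, the `Pod` struct, and the function names are all hypothetical; the actual label and the create/delete behaviour are not given in this description, so only the update path is modelled.

```go
package main

// markerLabel is a hypothetical label key; the real marker label used
// by the operator is not stated in this description.
const markerLabel = "example.invalid/host-networked"

// Pod is a stand-in with only the fields the predicate inspects.
type Pod struct {
	Labels      map[string]string
	HostNetwork bool     // mirrors spec.hostNetwork
	PodIPs      []string // mirrors status.podIPs
	Ready       bool     // mirrors the Ready condition
}

// isCandidate applies the gate: marker label present AND hostNetwork.
func isCandidate(p Pod) bool {
	_, labelled := p.Labels[markerLabel]
	return labelled && p.HostNetwork
}

// sameIPs compares two IP lists element by element.
func sameIPs(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// settledOnUpdate reports whether an update moved a gated pod into a
// state where its IPs can be judged: IPs appeared or changed, or the
// pod became Ready. Steady-state updates return false.
func settledOnUpdate(old, cur Pod) bool {
	if !isCandidate(cur) {
		return false
	}
	// Assumption: an update that leaves PodIPs empty is not judgeable.
	if len(cur.PodIPs) > 0 && !sameIPs(old.PodIPs, cur.PodIPs) {
		return true
	}
	return !old.Ready && cur.Ready
}
```

Checking the gate first keeps the cheap label and hostNetwork tests ahead of the IP comparison, so updates for unrelated pods are dropped immediately.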

Adds unit tests for the pod-settle predicate (create/update/delete, label and hostNetwork gating, IPs-appear and became-Ready transitions, steady-state no-op) and the podToNode mapping.

Description

Release Note

TBD

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a new feature.
    • enterprise if this PR applies to Calico Enterprise only.
