[CI-1951] fix(podiprecovery): add Pod watch so recovery is level-triggered #4987
Open
coutinhop wants to merge 1 commit into
Recovery was edge-triggered on the Node watch alone: the reconcile fired only when a node's host IPs changed. On a KubeVirt VM reboot the node's new IP is reported promptly, but the node's host-networked pods are still restarting at that instant with empty status.podIPs, so the reconcile skips them. Seconds later a surviving pod comes back reporting its old, now-stale IP (Kubernetes never refreshes status.podIPs for a surviving hostNetwork pod) — but the node's host IP has already settled, no further Node event fires, and the stale pod is never re-evaluated. The earlier autoscaler-tick approach did not have this gap because it re-checked on every tick.
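To make the gap concrete, here is a minimal sketch of the per-pod judgment the reconcile has to make. It is not the code in this PR: the `Verdict` type, the `judge` function, and the rule "every pod IP must be one of the node's host IPs" are assumptions for illustration, and the addresses are made up.

```go
package main

import "fmt"

// Verdict is a hypothetical three-way result for one host-networked pod.
type Verdict int

const (
	CannotJudge Verdict = iota // pod reports no IPs yet (still restarting)
	Current                    // every pod IP is one of the node's host IPs
	Stale                      // at least one pod IP is no longer on the node
)

// judge compares a pod's reported IPs against the node's current host IPs.
// Assumption: a pod is stale as soon as one of its IPs is not a node IP.
func judge(nodeIPs, podIPs []string) Verdict {
	if len(podIPs) == 0 {
		return CannotJudge
	}
	onNode := make(map[string]bool, len(nodeIPs))
	for _, ip := range nodeIPs {
		onNode[ip] = true
	}
	for _, ip := range podIPs {
		if !onNode[ip] {
			return Stale
		}
	}
	return Current
}

func main() {
	nodeIPs := []string{"10.0.0.7"} // node already reports its new address

	// At the moment of the Node event the pod is still restarting.
	fmt.Println(judge(nodeIPs, nil) == CannotJudge)

	// Seconds later the pod comes back with its old address, but with only
	// a Node watch nothing calls judge again.
	fmt.Println(judge(nodeIPs, []string{"10.0.0.5"}) == Stale)
}
```

The second call is the one that never happened before this change: the answer would have been "stale", but no event was left to ask the question.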
Add a second watch on operator-managed host-networked Pods that re-enqueues a pod's node when the pod settles into a state where its status.podIPs can be judged — its IPs appear/change, or it becomes Ready. Both watches funnel into the same node-keyed, idempotent Reconcile, so recovery is now level-triggered on both inputs to its decision (node addresses and pod IPs) while staying event-driven — no return to polling.
The predicate is gated on the host-networked marker label plus spec.hostNetwork, so event volume stays limited to the handful of such pods cluster-wide.
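The update half of the pod-settle predicate could look roughly like the sketch below. It uses a simplified `Pod` struct rather than the real Kubernetes types so that it runs standalone. The label key `example.invalid/host-networked`, the function names, and the choice to ignore pods whose new IP list is empty are all assumptions, not taken from the PR.

```go
package main

import "fmt"

// Pod is a simplified stand-in for the Pod fields that matter here.
type Pod struct {
	Labels      map[string]string
	HostNetwork bool     // spec.hostNetwork
	PodIPs      []string // status.podIPs
	Ready       bool     // Ready condition is True
}

// markerLabel is a hypothetical key; the real marker label is not shown here.
const markerLabel = "example.invalid/host-networked"

// gated reports whether the pod carries the marker label and uses host networking.
func gated(p Pod) bool {
	_, ok := p.Labels[markerLabel]
	return ok && p.HostNetwork
}

// sameIPs compares two IP lists element by element, in order.
func sameIPs(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// settledOnUpdate reports whether an update should re-enqueue the pod's node:
// the pod's IPs appeared or changed, or the pod became Ready.
// Assumption: with no IPs on the new object there is nothing to judge yet.
func settledOnUpdate(oldPod, newPod Pod) bool {
	if !gated(newPod) || len(newPod.PodIPs) == 0 {
		return false
	}
	if !sameIPs(oldPod.PodIPs, newPod.PodIPs) {
		return true
	}
	return !oldPod.Ready && newPod.Ready
}

func main() {
	labels := map[string]string{markerLabel: "true"}
	restarting := Pod{Labels: labels, HostNetwork: true}
	back := Pod{Labels: labels, HostNetwork: true, PodIPs: []string{"10.0.0.5"}}

	fmt.Println(settledOnUpdate(restarting, back)) // IPs appeared
	fmt.Println(settledOnUpdate(back, back))       // steady state
}
```

Checking the gate first keeps the IP and readiness comparisons off the path for every pod that is not an operator-managed host-networked pod.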
Adds unit tests for the pod-settle predicate (create/update/delete, label and hostNetwork gating, IPs-appear and became-Ready transitions, steady-state no-op) and the podToNode mapping.
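For readers unfamiliar with the mapping step, a pod-to-node mapping of this kind can be sketched as below. The `Pod` and `Request` structs are simplified stand-ins for the real Kubernetes and controller-runtime types, and returning no request for a pod without a node name is an assumption rather than confirmed behaviour of the PR's podToNode.

```go
package main

import "fmt"

// Pod is a simplified stand-in holding only the fields used by the mapping.
type Pod struct {
	Name     string
	NodeName string // spec.nodeName
}

// Request is a simplified stand-in for a node-keyed reconcile request.
type Request struct {
	Name string
}

// podToNode maps a pod event to a request for the node the pod runs on,
// so that Node events and Pod events end up in the same node-keyed queue.
// Assumption: a pod that is not yet scheduled produces no request.
func podToNode(p Pod) []Request {
	if p.NodeName == "" {
		return nil
	}
	return []Request{{Name: p.NodeName}}
}

func main() {
	fmt.Println(podToNode(Pod{Name: "agent-abc", NodeName: "node-1"}))
	fmt.Println(len(podToNode(Pod{Name: "pending-pod"})))
}
```

Because the request carries only the node name, several pod events for the same node collapse into one reconcile of that node.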
Description
Release Note
For PR author
make gen-files
make gen-versions
For PR reviewers
A note for code reviewers - all pull requests must have the following:
kind/bug if this is a bugfix.
kind/enhancement if this is a new feature.
enterprise if this PR applies to Calico Enterprise only.