controller: post-reboot handling and deployedDigest by jlebon · Pull Request #58 · bootc-dev/bootc-operator

jlebon · 2026-06-10T16:23:51Z

Add the key missing part in the rollout logic: we now actually free reboot slots once nodes reboot into the desired image and are Ready. This allows the state machine to continue rolling out to other nodes. We also keep the deployedDigest and updateAvailable fields up to date in the pool status.

With this I think we've more or less reached MVP level for the controller part at least. 🎉 Most of the remaining work is more about richer reporting back to the pool status and strengthening error-handling.

Currently, nodeStateIdle represents three things: 1. the node is running the desired image and there's nothing to do 2. the daemon hasn't checked in at all yet (new node) 3. the daemon hasn't reacted to a desiredImage change yet The "nothing to do" case is clearly important to distinguish from the others. The name 'nodeStateIdle' also is confusing. With this change, we replace nodeStateIdle with two states: 1. A new nodeStateUpToDate for case 1 above 2. A new nodeStatePending for cases 2 and 3 above Assisted-by: Pi (Claude Opus 4.6)

Might be just me, but I find it confusing that we're testing that nodes start on B and go to A. It's more natural to expect it the other way around. Also the 'booted' and 'target' names weren't clear enough. Switch to explicit 'old' and 'new' wording.

alicefr · 2026-06-11T06:49:26Z

One question: deployedDigest is never set if any node is degraded or unclassified, even if the rollout itself completed successfully. Is that what we want? Is it a stuck node avoid the rollout to be completed? I guess this follows the same principle as a stuck node holds its reboot slot

This is the crucial next step for the controller: we actually handle nodes rebooting into the desired image and free the reboot slot so rollout can continue on to other nodes. Assisted-by: Pi (Claude Opus 4.6)

Ouch, this is a pretty obvious bug. The effect wasn't disastrous (we just waste one reconcile), but still... Thankfully new tests I'm working on caught it. Assisted-by: Pi (Claude Opus 4.6)

jlebon · 2026-06-12T02:46:31Z

One question: deployedDigest is never set if any node is degraded or unclassified, even if the rollout itself completed successfully. Is that what we want? Is it a stuck node avoid the rollout to be completed? I guess this follows the same principle as a stuck node holds its reboot slot

Yeah, thanks for raising this. deployedDigest in general I'm not quite satisfied with to be frank, and I think we should probably drop it. I filed #62 for that and for now dropped that commit from this PR.

jlebon added 2 commits June 10, 2026 11:13

alicefr reviewed Jun 11, 2026

View reviewed changes

Comment thread internal/controller/rollout_envtest_test.go

alicefr reviewed Jun 11, 2026

View reviewed changes

Comment thread internal/controller/rollout_envtest_test.go Outdated

jlebon added 2 commits June 11, 2026 15:32

controller: free reboot slots for nodes that complete their update

3c23301

This is the crucial next step for the controller: we actually handle nodes rebooting into the desired image and free the reboot slot so rollout can continue on to other nodes. Assisted-by: Pi (Claude Opus 4.6)

controller: include newly created BootcNodes in ownedSet

68bfd5b

Ouch, this is a pretty obvious bug. The effect wasn't disastrous (we just waste one reconcile), but still... Thankfully new tests I'm working on caught it. Assisted-by: Pi (Claude Opus 4.6)

jlebon force-pushed the milestone-3b-c7 branch from d5dc006 to 68bfd5b Compare June 11, 2026 20:54

alicefr approved these changes Jun 12, 2026

View reviewed changes

alicefr merged commit 55db7a1 into main Jun 12, 2026
8 of 9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

controller: post-reboot handling and deployedDigest#58

controller: post-reboot handling and deployedDigest#58
alicefr merged 4 commits into
mainfrom
milestone-3b-c7

jlebon commented Jun 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

alicefr commented Jun 11, 2026 •

edited

Loading

Uh oh!

jlebon commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jlebon commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alicefr commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jlebon commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jlebon commented Jun 10, 2026 •

edited

Loading

alicefr commented Jun 11, 2026 •

edited

Loading