
controllers: use single error handling for all CRs #1964

Open
AndrewChubatiuk wants to merge 2 commits into master from controllers-use-single-error-handling

Conversation

AndrewChubatiuk (Contributor) commented Mar 13, 2026

Summary by cubic

Unifies error handling across all controllers and standardizes status access on CRs. Adds graceful shutdown awareness via context.WithCancelCause and operator.ErrShutdown, and requeues on non-shutdown cancellations to avoid missed reconciles.

  • Refactors

    • Replaced per-controller handlers with a single handleReconcileErr; removed handleReconcileErrWithoutStatus.
    • Moved GetStatusMetadata from Status structs to parent CRs; updated helpers (waitForStatus, status updates) and interfaces to read metadata from the CR.
    • Added GetStatus/DefaultStatusFields for VMAlertmanagerConfig, VMRule, VMUser, and scrape CRs (VMNodeScrape, VMPodScrape, VMProbe, VMScrapeConfig, VMServiceScrape, VMStaticScrape).
    • Standardized controllers to use value instances and pass pointers; aligned finalizer/defaulting and Create/Update calls.
    • VMRule and VMUser now capture JSON parse errors during unmarshal into Spec.ParsingError.
  • Bug Fixes

    • Requeue on context.Canceled unless the cancellation was caused by graceful shutdown (operator.ErrShutdown via context.WithCancelCause); keep NotFound silent and back off on conflicts.
    • Consistent error events, metrics, and conflict requeues across all CRs; controllers stop on invalid specs via spec.ParsingError.
    • Surface errors from related-resource updates while continuing other items (VMAgent/VMSingle scrapes, alert/rule config updates).
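
The requeue rules listed above can be sketched with the standard library alone. This is a minimal illustration, not the operator's code: `result`, `errNotFound`, `errConflict`, `simulate`, and the one-second delay are hypothetical stand-ins, and `ErrShutdown` only mirrors the name `operator.ErrShutdown` from the summary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrShutdown stands in for operator.ErrShutdown named in the summary.
var ErrShutdown = errors.New("operator is shutting down")

// Hypothetical stand-ins for the Kubernetes NotFound and Conflict API errors.
var (
	errNotFound = errors.New("not found")
	errConflict = errors.New("conflict")
)

// result is a simplified, hypothetical stand-in for a reconcile result.
type result struct {
	RequeueAfter time.Duration
}

// handleReconcileErr maps a reconcile error to a requeue decision.
func handleReconcileErr(ctx context.Context, err error) (result, error) {
	switch {
	case err == nil, errors.Is(err, errNotFound):
		// Success, or the object is gone: nothing to retry.
		return result{}, nil
	case errors.Is(err, errConflict):
		// Update conflict: retry after a short delay (assumed value).
		return result{RequeueAfter: time.Second}, nil
	case errors.Is(err, context.Canceled):
		if errors.Is(context.Cause(ctx), ErrShutdown) {
			// Graceful shutdown: do not requeue.
			return result{}, nil
		}
		// Cancelled for another reason: requeue so the reconcile is not lost.
		return result{RequeueAfter: time.Second}, nil
	default:
		// Everything else is returned to the caller as a real error.
		return result{}, err
	}
}

// simulate cancels a context with the given cause and feeds the resulting
// error through the handler.
func simulate(cause error) (result, error) {
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(cause)
	return handleReconcileErr(ctx, ctx.Err())
}

func main() {
	res, err := simulate(ErrShutdown)
	fmt.Println("shutdown:", res.RequeueAfter, err)
	res, err = simulate(nil)
	fmt.Println("other cancellation:", res.RequeueAfter, err)
}
```

Passing `nil` to the cancel function makes the cause default to `context.Canceled`, which is how the sketch models a cancellation that is not a shutdown.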

Written for commit 694ce24. Summary will update on new commits.

cubic-dev-ai bot (Contributor) left a comment

4 issues found across 49 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="api/operator/v1beta1/vmrule_types.go">

<violation number="1" location="api/operator/v1beta1/vmrule_types.go:249">
P1: Returning nil here hides VMRule JSON decode failures from admission; malformed rules can now be accepted and only fail later in reconciliation.</violation>
</file>
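
To illustrate the trade-off raised in this violation, here is a hedged sketch of an `UnmarshalJSON` that records the decode failure instead of returning it. `RuleSpec`, its `Groups` field, and `validate` are hypothetical; only the `ParsingError` field name comes from the PR summary. Because the decoder itself reports success, any admission step has to inspect `ParsingError` explicitly.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RuleSpec is a hypothetical, trimmed-down spec; only ParsingError mirrors
// the field named in the PR summary.
type RuleSpec struct {
	Groups       []string `json:"groups"`
	ParsingError string   `json:"-"`
}

// UnmarshalJSON records decode failures in ParsingError and returns nil,
// so the object still loads and the failure can be reported later.
func (s *RuleSpec) UnmarshalJSON(data []byte) error {
	type plain RuleSpec // plain has no methods, which avoids infinite recursion
	if err := json.Unmarshal(data, (*plain)(s)); err != nil {
		s.ParsingError = fmt.Sprintf("cannot parse spec: %s", err)
	}
	return nil
}

// validate is the explicit check an admission step would need, since the
// decoder no longer fails on its own.
func validate(s *RuleSpec) error {
	if s.ParsingError != "" {
		return fmt.Errorf("invalid spec: %s", s.ParsingError)
	}
	return nil
}

func main() {
	var s RuleSpec
	// Valid JSON with a wrong field type: decoding "succeeds", validation fails.
	err := json.Unmarshal([]byte(`{"groups": 42}`), &s)
	fmt.Println("decode error:", err)
	fmt.Println("validate error:", validate(&s))
}
```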

<file name="internal/controller/operator/vmrule_controller.go">

<violation number="1" location="internal/controller/operator/vmrule_controller.go:122">
P1: Don't only log configmap update failures here. A later successful iteration clears the shared named `err`, so this reconcile can return success after partially failing to update `VMAlert` configmaps.</violation>
</file>

<file name="internal/controller/operator/vmuser_controller.go">

<violation number="1" location="internal/controller/operator/vmuser_controller.go:131">
P1: Propagate this `CreateOrUpdateConfig` failure instead of only logging it; otherwise the reconcile returns success and controller-runtime will not retry the VMUser update.</violation>
</file>
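
One way to address both propagation issues above is to collect per-item failures and return them together, so a later successful iteration cannot clear an earlier error. The sketch below uses `errors.Join` (Go 1.20+); `updateAll` and `errBoom` are hypothetical names, not functions from this repository.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a hypothetical failure used by the demo.
var errBoom = errors.New("boom")

// updateAll applies update to every item, keeps going after a failure,
// and returns all failures joined into one error (nil if none occurred).
func updateAll(names []string, update func(string) error) error {
	var errs []error
	for _, name := range names {
		// err is scoped to this iteration, so it cannot be overwritten
		// by a later successful call.
		if err := update(name); err != nil {
			errs = append(errs, fmt.Errorf("cannot update %q: %w", name, err))
		}
	}
	return errors.Join(errs...)
}

func main() {
	err := updateAll([]string{"a", "b", "c"}, func(name string) error {
		if name == "b" {
			return errBoom
		}
		return nil
	})
	fmt.Println(err)
}
```

Returning the joined error lets the caller hand it to the common error handler, which keeps the retry decision in one place.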

<file name="internal/controller/operator/vmalertmanagerconfig_controller.go">

<violation number="1" location="internal/controller/operator/vmalertmanagerconfig_controller.go:75">
P2: Treat `ParsingError` as non-retryable here; returning it through `handleReconcileErr` will put malformed configs into a permanent requeue/event loop.</violation>
</file>
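
The non-retryable handling suggested in the last violation could look roughly like this. It is a standard-library sketch under assumptions: `parsingError`, `errTransient`, and `shouldRequeue` are hypothetical. In a real controller, controller-runtime's `reconcile.TerminalError` wrapper may serve a similar purpose; that is an assumption about the fix, not something this PR confirms.

```go
package main

import (
	"errors"
	"fmt"
)

// parsingError is a hypothetical error type marking a spec that cannot be
// fixed by retrying.
type parsingError struct{ msg string }

func (e *parsingError) Error() string { return e.msg }

// errTransient is a hypothetical retryable failure.
var errTransient = errors.New("temporary API failure")

// shouldRequeue reports whether retrying the reconcile can help.
func shouldRequeue(err error) bool {
	if err == nil {
		return false
	}
	var pe *parsingError
	if errors.As(err, &pe) {
		// A malformed spec stays malformed until the user edits it.
		return false
	}
	return true
}

func main() {
	bad := fmt.Errorf("reconcile failed: %w", &parsingError{msg: "unexpected field"})
	fmt.Println(shouldRequeue(bad), shouldRequeue(errTransient))
}
```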

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch from a1f67e9 to 66678ab on March 13, 2026 13:59
cubic-dev-ai bot (Contributor) left a comment

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/controller/operator/controllers.go">

<violation number="1" location="internal/controller/operator/controllers.go:133">
P2: Check `context.Cause(ctx)` here instead of `err`; `WithCancelCause` stores `ErrShutdown` on the context, so graceful shutdowns will usually still be requeued.</violation>
</file>
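
The difference this violation points at can be seen in a few lines: after `cancel(ErrShutdown)`, `ctx.Err()` is still `context.Canceled`, and only `context.Cause(ctx)` carries the shutdown cause. `ErrShutdown` here is a local stand-in for `operator.ErrShutdown`, and `inspect` is a hypothetical helper.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrShutdown is a local stand-in for operator.ErrShutdown.
var ErrShutdown = errors.New("operator is shutting down")

// inspect cancels a context with ErrShutdown and reports whether the cause
// is visible through ctx.Err() and through context.Cause(ctx).
func inspect() (viaErr, viaCause bool) {
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(ErrShutdown)
	viaErr = errors.Is(ctx.Err(), ErrShutdown)
	viaCause = errors.Is(context.Cause(ctx), ErrShutdown)
	return viaErr, viaCause
}

func main() {
	viaErr, viaCause := inspect()
	fmt.Println(viaErr, viaCause) // prints "false true"
}
```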

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch from 163df38 to 00bc895 on March 13, 2026 20:45
AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch from 00bc895 to 694ce24 on March 13, 2026 20:54


2 participants