Revert "Fix SCEP autorenew failing for offline hosts (#44250)" #44535

Open
mostlikelee wants to merge 1 commit into main from revert-44250-scep-autorenew

@mostlikelee mostlikelee commented Apr 30, 2026

Summary

Reverts #44250. A forward-fix PR that addresses the original issue (#44111) without the regressions described below will follow.

Related issue: #44111 (still open after this revert)

Summary by CodeRabbit

  • Bug Fixes
    • Simplified iOS/iPadOS managed certificate profile verification workflow.
    • Reduced SCEP challenge expiration time from 7 days to 1 hour.
    • Modified SCEP renewal logic to remove retry backoff behavior for failed renewals.
    • Removed a redundant database operation during certificate metadata updates.

@mostlikelee mostlikelee marked this pull request as ready for review April 30, 2026 19:41
@mostlikelee mostlikelee requested a review from a team as a code owner April 30, 2026 19:41
Copilot AI review requested due to automatic review settings April 30, 2026 19:41
@claude claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

coderabbitai Bot commented Apr 30, 2026

Walkthrough

This pull request removes previously implemented SCEP renewal improvements and managed certificate handling optimizations. The changes include deletion of changelog entries documenting SCEP renewal behavior, simplification of iOS/iPadOS managed certificate profile status transitions to skip the verifying state, removal of certificate metadata preservation logic during renewals, elimination of time-based retry backoff for failed certificate renewals, reduction of the one-time challenge TTL from 7 days to 1 hour, and deletion of related test coverage including challenge lifecycle tests and SCEP renewal integration tests.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description provides the revert reason (regressions from #44250), indicates a follow-up forward-fix is planned, and references related issues (#44111), but does not follow the repository's PR template structure or checklist. | Complete the repository's standard PR description template by filling out relevant checklist items (e.g., testing performed, any database/migration considerations, changes file verification). |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly describes the primary action: reverting PR #44250 regarding SCEP autorenew for offline hosts, which matches the changeset of removing code changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/datastore/mysql/mdm.go`:
- Around line 2933-2936: The UPDATE building logic (updateQuery +
hostProfileClause with values) is incorrectly resetting any non-NULL status
(including in-flight statuses like "pending" and "verifying") to NULL; change
the WHERE clause construction so only failing/terminal statuses eligible for
retry are cleared (e.g., explicit list such as "failed", "error", or other
non-in-flight codes) and ensure in-flight statuses ("pending", "verifying") are
excluded; adjust the values slice used for the parameterized query (refer to
updateQuery, hostProfileClause, and values) accordingly and restore any prior
cooldown/filtering logic that avoided immediate re-enqueue on CA outages (e.g.,
keep failing-row cooldown checks or last_attempt timestamp conditions instead of
blanket NULLing).
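
The allow-list approach this comment suggests can be sketched as follows. This is a minimal illustration, not the code in `mdm.go`: the helper `buildResetClause`, the `status` column name, and the contents of `retryableStatuses` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// retryableStatuses is a hypothetical allow-list of terminal statuses.
// The real status values live in the Fleet codebase and may differ.
var retryableStatuses = []string{"failed"}

// buildResetClause returns a WHERE fragment plus its bind values that match
// only rows whose status is in the allow-list, so in-flight rows such as
// "pending" or "verifying" are never reset to NULL.
func buildResetClause(statuses []string) (string, []any) {
	if len(statuses) == 0 {
		// An empty IN () is invalid SQL, so match nothing instead.
		return "1 = 0", nil
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(statuses)), ",")
	values := make([]any, 0, len(statuses))
	for _, s := range statuses {
		values = append(values, s)
	}
	return fmt.Sprintf("status IN (%s)", placeholders), values
}

func main() {
	clause, values := buildResetClause(retryableStatuses)
	fmt.Println(clause, values)
}
```

Listing the eligible statuses explicitly, rather than matching every non-NULL status, means a newly introduced in-flight status is excluded by default.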

In `@server/service/integration_mdm_test.go`:
- Around line 18168-18171: The test currently hardcodes an expiry offset using
time.Now().Add(-2*time.Hour) when updating the challenges table; replace that
literal with a value derived from the system TTL so expiration remains correct
if fleet.OneTimeChallengeTTL changes: compute the timestamp as
time.Now().Add(-fleet.OneTimeChallengeTTL - buffer) (use a small buffer like
time.Minute) and pass that into q.ExecContext in the mysql.ExecAdhocSQL block
that updates stmt for gotChallenge2 so the challenge is reliably expired
relative to fleet.OneTimeChallengeTTL.
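
To see why a TTL-derived backdate is more robust than a fixed 2-hour one, consider the sketch below. It is an illustration only: `isExpired` is a hypothetical stand-in for whatever expiry check the datastore performs, and the constants mirror the 1-hour and 7-day values discussed in this PR.

```go
package main

import (
	"fmt"
	"time"
)

const (
	oneHourTTL  = time.Hour          // OneTimeChallengeTTL after this revert
	sevenDayTTL = 7 * 24 * time.Hour // value that #44250 had introduced
	fixedOffset = 2 * time.Hour      // hard-coded backdate used by the test
	buffer      = time.Minute        // hypothetical safety margin
)

// isExpired is an assumed expiry rule: a challenge is expired once its age
// exceeds the TTL.
func isExpired(createdAt, now time.Time, ttl time.Duration) bool {
	return now.Sub(createdAt) > ttl
}

// expiredWithFixedOffset backdates by a constant, independent of the TTL.
func expiredWithFixedOffset(ttl time.Duration) bool {
	now := time.Now()
	return isExpired(now.Add(-fixedOffset), now, ttl)
}

// expiredWithDerivedOffset backdates by the TTL plus a small buffer.
func expiredWithDerivedOffset(ttl time.Duration) bool {
	now := time.Now()
	return isExpired(now.Add(-ttl-buffer), now, ttl)
}

func main() {
	for _, ttl := range []time.Duration{oneHourTTL, sevenDayTTL} {
		fmt.Printf("ttl=%v fixed=%v derived=%v\n",
			ttl, expiredWithFixedOffset(ttl), expiredWithDerivedOffset(ttl))
	}
}
```

Under these assumptions the fixed offset only expires the challenge while the TTL stays below 2 hours, whereas the derived offset expires it for any TTL.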

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 317b3b6e-2c45-4748-a929-cc48fb5aae87

📥 Commits

Reviewing files that changed from the base of the PR and between f70a02a and ae1a92b.

📒 Files selected for processing (8)
  • changes/44111-scep-autorenew-fail
  • server/datastore/mysql/apple_mdm.go
  • server/datastore/mysql/apple_mdm_test.go
  • server/datastore/mysql/challenges_test.go
  • server/datastore/mysql/host_certificates.go
  • server/datastore/mysql/mdm.go
  • server/fleet/mdm.go
  • server/service/integration_mdm_test.go
💤 Files with no reviewable changes (4)
  • server/datastore/mysql/challenges_test.go
  • server/datastore/mysql/host_certificates.go
  • changes/44111-scep-autorenew-fail
  • server/datastore/mysql/apple_mdm_test.go

Comment thread server/datastore/mysql/mdm.go
Comment thread server/service/integration_mdm_test.go

Copilot AI left a comment

Pull request overview

Reverts PR #44250’s SCEP autorenew changes to roll back the managed-cert renewal behavior (and associated tests/changelog), pending a forward-fix that avoids the reported regressions.

Changes:

  • Revert fleet.OneTimeChallengeTTL from 7 days back to 1 hour.
  • Revert MySQL MDM renewal/upsert logic changes (including removing COALESCE preservation and renewal retry/backoff/status filtering changes).
  • Remove the regression/integration tests and changes entry that were added for #44111/#44250.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| server/service/integration_mdm_test.go | Reverts test logic to hard-coded 2h challenge backdate. |
| server/fleet/mdm.go | Reverts OneTimeChallengeTTL constant back to 1 hour. |
| server/datastore/mysql/mdm.go | Reverts managed-cert upsert and renewal selection/update behavior. |
| server/datastore/mysql/host_certificates.go | Removes iOS managed-cert “flip verifying→verified” side-effect during cert detail updates. |
| server/datastore/mysql/challenges_test.go | Deletes challenge lifecycle/TTL unit tests introduced in #44250. |
| server/datastore/mysql/apple_mdm_test.go | Removes iOS managed-cert and renewal regression subtests introduced in #44250. |
| server/datastore/mysql/apple_mdm.go | Reverts iOS/iPadOS ack behavior to always short-circuit install verifying→verified. |
| changes/44111-scep-autorenew-fail | Removes the changelog entry added in #44250. |

Comment thread server/datastore/mysql/mdm.go
Comment thread server/service/integration_mdm_test.go
Comment thread server/datastore/mysql/apple_mdm.go
mostlikelee added a commit that referenced this pull request Apr 30, 2026
Reapplies the three independent improvements from #44250 (reverted via #44535)
and adds an ingest-side backfill that catches the actual silent-fail mechanism
(missed toInsert matcher) without breaking the natural in-flight
synchronization between reconcile and the renewal cron.

- Bump OneTimeChallengeTTL 1h → 7d so renewals don't fail with "challenge not
  found" for offline devices that pick up the InstallProfile push days later.
- Restrict the renewal cron to settled delivery states ('verified', 'failed')
  to avoid re-firing renewal while a previous delivery is still in flight.
- Gate the new 'failed' branch on a 24h backoff so permanent render-time
  failures (CA deleted, missing IDP variables) don't loop hourly.
- Add backfillHostMDMManagedCertsFromHostCertsDB: when the toInsert matcher
  in UpdateHostCertificates misses a renewed cert (replica lag, transaction
  race, verified-without-actual-renewal), look up a matching cert in
  host_certificates by the 'fleet-<profile_uuid>' substring and populate
  hmmc. Gated by a 4h grace on hmmc.updated_at so it doesn't clobber the
  in-flight blank-out, and a monotonic-forward predicate so it's idempotent.
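
The second and third bullets together describe an eligibility rule for the renewal cron, which might look roughly like the sketch below. The function `eligibleForRenewal` and its inputs are hypothetical, and measuring the backoff from the last attempt is an assumption; the commit message does not say which timestamp the 24h gate uses.

```go
package main

import (
	"fmt"
	"time"
)

// failedBackoff mirrors the 24h backoff named in the commit message.
const failedBackoff = 24 * time.Hour

// Example durations used in main.
const (
	recentFailure = 1 * time.Hour
	oldFailure    = 48 * time.Hour
)

// eligibleForRenewal reports whether the renewal cron should pick up a row.
// status stands in for the profile delivery status and sinceLastAttempt for
// the time elapsed since the last attempt (an assumption for this sketch).
func eligibleForRenewal(status string, sinceLastAttempt time.Duration) bool {
	switch status {
	case "verified":
		// Settled and healthy: renew as soon as the cert is due.
		return true
	case "failed":
		// Settled but failing: retry only after the backoff has elapsed.
		return sinceLastAttempt >= failedBackoff
	default:
		// Any other state (e.g. pending, verifying): delivery in flight.
		return false
	}
}

func main() {
	fmt.Println(eligibleForRenewal("verified", 0))
	fmt.Println(eligibleForRenewal("failed", recentFailure))
	fmt.Println(eligibleForRenewal("failed", oldFailure))
	fmt.Println(eligibleForRenewal("verifying", oldFailure))
}
```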

Does NOT reintroduce the COALESCE-preserve in BulkUpsertMDMManagedCertificates
or the iOS-only park-at-'verifying' carve-out from #44250 — those broke the
natural cron synchronization gate (reconcile NULLs hmmc → cron's HAVING IS
NOT NULL excludes the row until ingest repopulates).

Resolves #44111
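
The two guards on the backfill described in the commit message above (a 4h grace on `hmmc.updated_at` and a monotonic-forward predicate) can be sketched as below. This is an illustration under stated assumptions: `shouldBackfill` is a hypothetical helper, and comparing certificate expiry times is only one guess at what "monotonic-forward" compares, since the commit message does not name the column.

```go
package main

import (
	"fmt"
	"time"
)

// backfillGrace mirrors the 4h grace named in the commit message.
const backfillGrace = 4 * time.Hour

// Example inputs used in main.
const (
	withinGrace = 1 * time.Hour
	pastGrace   = 5 * time.Hour
)

var (
	blank     time.Time // zero value: no expiry recorded (blanked-out row)
	oldExpiry = time.Date(2026, time.June, 1, 0, 0, 0, 0, time.UTC)
	newExpiry = time.Date(2027, time.June, 1, 0, 0, 0, 0, time.UTC)
)

// shouldBackfill decides whether a certificate found in host_certificates
// may be copied into the managed-certificate row.
func shouldBackfill(sinceUpdate time.Duration, recorded, candidate time.Time) bool {
	// Grace period: leave recently touched rows alone so an in-flight
	// blank-out is not overwritten.
	if sinceUpdate < backfillGrace {
		return false
	}
	// Monotonic-forward (assumed to mean expiry only moves later): applying
	// the same candidate twice is a no-op, which makes the step idempotent.
	return recorded.IsZero() || candidate.After(recorded)
}

func main() {
	fmt.Println(shouldBackfill(withinGrace, blank, newExpiry))
	fmt.Println(shouldBackfill(pastGrace, blank, newExpiry))
	fmt.Println(shouldBackfill(pastGrace, newExpiry, newExpiry))
	fmt.Println(shouldBackfill(pastGrace, newExpiry, oldExpiry))
}
```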
@mostlikelee mostlikelee mentioned this pull request Apr 30, 2026

codecov Bot commented Apr 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.77%. Comparing base (f70a02a) to head (ae1a92b).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #44535      +/-   ##
==========================================
- Coverage   66.79%   66.77%   -0.03%     
==========================================
  Files        2637     2637              
  Lines      212132   212110      -22     
  Branches     9437     9437              
==========================================
- Hits       141688   141629      -59     
- Misses      57578    57609      +31     
- Partials    12866    12872       +6     
| Flag | Coverage Δ |
|---|---|
| backend | 68.54% <100.00%> (-0.03%) ⬇️ |

mostlikelee added a commit that referenced this pull request Apr 30, 2026

2 participants