
MM-68539: SLA digest fixes#999

Merged
jgheithcock merged 15 commits into master from MM-68539-sla-digest-fixes
Apr 28, 2026

Conversation

@jgheithcock (Contributor) commented Apr 28, 2026

Summary

This branch hardens the daily PR-review SLA digest in three ways:

Coverage. The digest now scans every open non-draft PR in each configured org via a single GraphQL search, instead of running a per-user search for each connected Mattermost user (the prior approach silently skipped reviewers who hadn't connected their GitHub account). Direct user-requests and team-requests are flattened and de-duplicated; team membership is fetched once per digest run.

Correctness. The SLA clock now backfills from the PR timeline whenever the plugin doesn't have a cached review_requested webhook record — for example because the PR was opened before the plugin was installed, or because the most-recent request preceded a Mattermost restart. Without this, PRs whose pending review was actually a few days old were showing up as "86 days overdue" against the configured target. Backfilled times are written back to KV so subsequent digest runs hit the fast path.

Presentation. The digest groups overdue items into seven fixed buckets (1–3 / 4–7 / 8–14 / 15–30 / 31–90 / 91–365 / >1y), renders PR titles as markdown links (with ] / \ escaping for JIRA-prefixed names), and renders each reviewer as their actual Mattermost @-mention via getGitHubToUsernameMapping. Reviewers without a Mattermost account render as (not connected) - <login> so admins can see who's missing without producing dead @-mentions. The configured SLA target appears in the post header. The RHS panel grew per-item SLA badges (Overdue / Due today / Due in N days) sharing math with the existing sidebar red-button indicator via a new webapp/src/utils/sla module.
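As a rough sketch, the seven fixed buckets amount to a simple threshold ladder. The helper name and label strings below are hypothetical; only the boundaries come from the description above:

```go
package main

import "fmt"

// overdueBucket maps a days-overdue count to one of the seven fixed
// ranges (1-3 / 4-7 / 8-14 / 15-30 / 31-90 / 91-365 / >1y).
// Hypothetical helper, not the plugin's actual code.
func overdueBucket(days int) string {
	switch {
	case days <= 3:
		return "1-3 days"
	case days <= 7:
		return "4-7 days"
	case days <= 14:
		return "8-14 days"
	case days <= 30:
		return "15-30 days"
	case days <= 90:
		return "31-90 days"
	case days <= 365:
		return "91-365 days"
	default:
		return "over 1 year"
	}
}

func main() {
	fmt.Println(overdueBucket(2))   // 1-3 days
	fmt.Println(overdueBucket(86))  // 31-90 days
	fmt.Println(overdueBucket(400)) // over 1 year
}
```

Grouping this way caps the digest at seven headers regardless of how many PRs are overdue.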

The final two commits in this branch decompose collectAllOverdueSLAItems into small testable helpers and fix two latent bugs uncovered while doing so:

  1. Concurrent GraphQL state. orgOpenPRsQuery was a package-level var that the graphql library decoded into. Concurrent callers — the digest scheduler running alongside an LHS fetch on a different goroutine — could clobber each other's Search.Nodes mid-decode. Now defined as a type at package scope (so the library can still reflect on its tags) but instantiated locally per call.
  2. Misleading backfill summary log. The backfill summary log was deferred inside the resolver constructor, so it fired immediately when the closure was created and always reported timeline_pages_fetched=0 kv_hits=0 regardless of actual usage. Now wired up via a returned summarize() callback the digest invokes via defer; counters log as plain ints (not Sprintf'd strings) so structured-log backends keep them numeric; the resolver isn't constructed at all on degenerate runs (no PRs, no orgs configured) so the summary line only appears when the digest actually had work to do.

Ticket Link

https://mattermost.atlassian.net/browse/MM-68539

For sysadmins: SLA backfill summary log

Once per local-day digest run, the plugin emits one Info line:

[INFO] SLA digest backfill summary  timeline_pages_fetched=N  kv_hits=M
  • timeline_pages_fetched — distinct PRs whose GitHub issue timeline the digest had to pull because no cached SLA-start was on file.
  • kv_hits — (PR, reviewer) pairs whose SLA-start time came from KV (either from a live review_requested webhook or a previous run's backfill).

Together these tell you whether the SLA backfill is healthy. The line lives at Info on purpose, so admins don't have to enable Debug logging just to spot trouble.

What healthy looks like

| Day | timeline_pages_fetched | kv_hits | Meaning |
| --- | --- | --- | --- |
| 1 (fresh install) | ~equal to # of open PRs with reviewers | low | Cold cache; almost everything backfilled from the timeline. |
| 2 | small | rises sharply | Yesterday's backfilled times now cached. |
| 14+ (steady state) | 1–2 | dominates | Only new PRs since last run need backfilling. |

The shape to internalize: after a few days, kv_hits should dominate; timeline_pages_fetched tracks new PR activity since the last run.

What a problem looks like

| Pattern | Likely cause | Where to look next |
| --- | --- | --- |
| timeline_pages_fetched stays high day after day, kv_hits near 0 | Backfilled times aren't being persisted | Look for "Failed to store review SLA start time" Warn entries in the same log window |
| Same shape, no warn logs about KV writes | Timeline returning nothing for every reviewer (e.g. all events are review_request_removed, GitHub schema drift, scope/permission issue on the service user's token) | Enable Debug; look for "SLA digest timeline fetch failed" |
| timeline_pages_fetched >> count of open PRs in your configured org(s) | Per-run cache is broken, or overlapping orgs are double-walking the same PRs | Cross-check OrganizationsList config; report as a bug |
| Line never appears | Digest didn't actually run | Look for "SLA digest cannot run" warnings (no orgs configured / no service user available / no PRs found) |

Quick triage when "the days-overdue numbers look wrong"

  1. Find the most recent SLA digest backfill summary line in the server log.
  2. Is kv_hits proportional to the number of overdue items in the channel post? If not, the cache isn't being used → days-overdue is falling back to PR-open dates → numbers are inflated.
  3. Is timeline_pages_fetched plausible vs. the open-PR count? If it's far above, the per-run cache is broken; if it stays high day after day with no kv_hits, the writeback is broken.
  4. Cross-reference with SLA digest timeline fetch failed Debug entries (need Debug logging enabled to see those) to separate "fetch failing" from "fetch returning nothing."

Test plan

  • go test ./server/... passes
  • go vet ./server/... clean
  • No new linter errors
  • New test TestNewDigestSLAStartResolver_LogsAccurateCounters locks in both invariants of the summary-log fix (no fire at construction time, exactly one fire when summarize() is called, with correct counters)
  • New tests for prRefFromIssue, gatherReviewersForPR, and pickServiceGitHubUser (deterministic ordering, broken-token skip, no-connected-users)
  • Sanity-check on a real Mattermost install: run the digest, confirm a non-zero summary line appears in the server log
  • Sanity-check after a second run: confirm kv_hits rises and timeline_pages_fetched drops

Change Impact: 🔴 High

Reasoning: This PR makes broad, cross-cutting changes to the SLA digest system: it replaces per-user REST searches with an org-wide GraphQL discovery path, introduces timeline-based SLA backfill and KV persistence, and updates server and webapp presentation (including new SLA badges). The work spans multiple core modules and alters data-gathering, persistence, and user-facing behavior, so the blast radius and potential for subtle failures are large despite added unit tests.

Regression Risk: High — the new GraphQL-based discovery, pagination, timeline backfill, and KV caching touch core data paths and shared utilities; incorrect pagination, timeline interpretation, or KV handling could lead to missed or misattributed SLA anchors and UI/display inconsistencies. Although extensive tests were added, there are still integration risks across server↔client boundaries and failure/cancellation paths.

QA Recommendation: Strong manual QA and integration testing are recommended. Validate org-wide PR discovery with mixed direct/team requests, SLA backfill correctness on PRs lacking webhook history, KV persistence/resilience across restarts, correct reviewer rendering for connected vs unconnected users, SLA-badge states in the RHS and sidebar, and failure/cancellation behaviors (ensure the daily marker isn’t advanced on incomplete scans). Automated end-to-end tests covering GraphQL → digest → UI rendering should be added before broad rollout.

Generated by CodeRabbitAI

Drafts are not ready for review and shouldn't appear in /github todo,
the daily reminder DM, or the SLA overdue digest. Add draft:false to
the shared review search query so all three call sites filter drafts
consistently.

Replace the per-day buckets with seven fixed ranges (1-3 / 4-7 / 8-14 /
15-30 / 31-90 / 91-365 / >1 year) so a digest with many overdue PRs
stays scannable instead of producing dozens of single-day headers.

Also surface the configured SLA target in the top-level header so
readers can see what threshold the buckets are measured against.

Render the PR title as a markdown link so readers can click straight
through from the channel digest to the pull request. Escape `]` and
`\` in titles so JIRA-prefixed names like "[MM-12345] Fix thing" do
not terminate the link early.
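The escaping can be sketched as below. Order matters: the backslash must be escaped before `]`, otherwise the freshly inserted `\` would itself get doubled. The function name and URL here are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// escapeMarkdownLinkText escapes the two characters that can break out
// of a markdown link's text: backslash first, then the closing bracket.
// Illustrative sketch, not necessarily the plugin's exact helper.
func escapeMarkdownLinkText(s string) string {
	s = strings.ReplaceAll(s, `\`, `\\`)
	s = strings.ReplaceAll(s, `]`, `\]`)
	return s
}

func main() {
	title := "[MM-12345] Fix thing"
	fmt.Printf("[%s](https://example.invalid/pr/1)\n", escapeMarkdownLinkText(title))
	// [[MM-12345\] Fix thing](https://example.invalid/pr/1)
}
```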

Map each reviewer's GitHub login to their Mattermost username via the
existing getGitHubToUsernameMapping helper so the digest produces real
Mattermost @-mentions:

  - @harrison (hmhealey) - owner/repo - [title](url)

Reviewers who haven't connected the GitHub plugin render as
"(not connected) - <githublogin>" so the digest still surfaces them
without misleading anyone with an @ that won't notify.

The digest's days-overdue clock previously used the cached time of the
review_requested webhook, falling back to the PR open date when no KV
record existed. PRs whose most recent review was requested before the
plugin was installed therefore showed nonsense numbers like 86 days
overdue when the actual outstanding review was a few days old.

When the digest collector encounters a PR with no cached SLA start, it
now pages the issue timeline once to find the most recent surviving
review_requested event for the reviewer and caches the result under
the existing slarr_v1_* key. Subsequent runs are free.

Only the digest path uses the new effectiveReviewSLAStartForDigest
variant; /github todo and the RHS panel keep using the no-network
effectiveReviewSLAStart so user-facing requests stay fast.

The digest used to iterate over each connected Mattermost user and run a
separate GitHub search per user. That meant a reviewer was only included
when they had personally connected their GitHub account, which is why the
first run only listed hmhealey.

Replace that loop with a single org-wide GraphQL search per configured
organization. The query asks for every open non-draft PR plus its
requestedReviewers (users and teams). Team review requests are expanded
once per run via the Teams API, with results cached so we don't refetch
the same team on every PR. Each (PR, reviewer) pair is deduped before
being scored against the SLA target.

The org scan needs an authenticated caller, so we pick the first
connected Mattermost user as a service user. If no Mattermost user has
connected a GitHub account yet, the digest logs a warning and skips
posting rather than running with no token.

Also cleans up an embedded-field selector in
findMostRecentReviewRequestTime that staticcheck flagged after the
timeline backfill commit landed.
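The dedup step over (PR, reviewer) pairs can be sketched as a case-insensitive set over logins, keeping first-seen casing. This is an illustrative helper only:

```go
package main

import (
	"fmt"
	"strings"
)

// dedupeReviewers collapses duplicate reviewer logins (e.g. someone
// requested directly and again via a team) case-insensitively,
// keeping the first-seen casing. Illustrative sketch only.
func dedupeReviewers(logins []string) []string {
	seen := make(map[string]bool)
	out := make([]string, 0, len(logins))
	for _, l := range logins {
		key := strings.ToLower(l)
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, l)
	}
	return out
}

func main() {
	fmt.Println(dedupeReviewers([]string{"hmhealey", "HMHealey", "jgheithcock", "hmhealey"}))
	// [hmhealey jgheithcock]
}
```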

The "Pull Requests Needing Review" panel showed each PR's open date but
not whether the review was actually overdue against the configured SLA,
so engineers had no way to spot urgent items at a glance.

Render an SLA badge per review item with one of three states:
"Overdue by N days" (red), "Due today" (amber), or "Due in N days"
(neutral). The badge only appears in the REVIEWS panel and only when an
admin has set a positive Review Target Days; other RHS panels and
unconfigured installs are unchanged.

Extract the SLA-start and overdue computation into webapp/src/utils/sla
so the existing red-button indicator on the sidebar buttons and the new
per-item badge agree on what "overdue" means. The math mirrors the
server's calendar-day diff (UTC midnight to UTC midnight) so a PR that
is overdue in the digest is overdue in the RHS too. The shared util
prefers review_sla_start (which the server already populates from the
review_request webhook or timeline backfill) and falls back to
created_at, matching the digest's effectiveReviewSLAStart.

Also tightens the configuration reducer's typing so review_target_days
flows through getSidebarData without needing per-call assertions.

orgOpenPRsQuery was a package-level var that the graphql library
decoded into. Concurrent callers — e.g. the daily digest scheduler
running alongside an LHS fetch on a different goroutine — could
clobber each other's Search.Nodes mid-decode, producing partial or
duplicated PR lists.

Define the type at package scope (so the graphql library can still
reflect on its tags) but instantiate locally per call so each caller
writes into its own struct.
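The pattern can be illustrated with a toy query struct: the type stays at package scope so its struct tags remain reflectable, but each call decodes into a fresh local value. The fields and tags below are illustrative, not the plugin's real query:

```go
package main

import (
	"fmt"
	"sync"
)

// searchQuery is defined at package scope so a graphql library could
// still reflect on its struct tags, but no package-level instance
// exists for concurrent callers to clobber. Toy struct for illustration.
type searchQuery struct {
	Search struct {
		Nodes []string `graphql:"nodes"`
	} `graphql:"search"`
}

// fetchPRs decodes into a local instance, so concurrent callers never
// share Search.Nodes mid-decode.
func fetchPRs(id int) []string {
	var q searchQuery // fresh per call: no shared mutable state
	q.Search.Nodes = []string{fmt.Sprintf("pr-from-caller-%d", id)}
	return q.Search.Nodes
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = fetchPRs(i)
		}(i)
	}
	wg.Wait()
	fmt.Println("each goroutine decoded into its own struct")
}
```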

The digest's collectAllOverdueSLAItems was a 90-line function that
forged synthetic *github.Issue values to feed into helpers designed
for search-result PRs. Replace that with a small abstraction and
pull each concern into its own function that can be tested in
isolation.

- Introduce prRef as the minimal PR identity SLA code needs (owner,
  repo, number, created_at). effectiveReviewSLAStart now takes prRef
  directly; user-facing call sites build one via prRefFromIssue. The
  digest constructs prRef directly from graphql.DigestPR with no
  forging.
- Extract storeReviewSLAStart so the live review_requested webhook
  and the timeline backfill share one wire-format codepath instead
  of duplicating the RFC3339Nano encode-and-Set logic.
- Extract fetchPRTimeline so timeline events can be cached at the
  digest level. Previously a PR with N requested reviewers triggered
  N timeline fetches; now it's one fetch shared across all reviewers
  on that PR.
- Split collectAllOverdueSLAItems into fetchAllOrgOpenPRs,
  newTeamMemberResolver, gatherReviewersForPR,
  newDigestSLAStartResolver, and evaluateOverdueForReviewer.
- Make pickServiceGitHubUser deterministic by sorting all token-key
  pages before iterating. Without this, a multi-node cluster could
  silently pick different service users on different days as KV
  iteration order shifted, changing which private repos and teams
  the digest could see.
- newDigestSLAStartResolver returns a summarize() callback the
  caller invokes via defer, so its Info-level "backfill summary" log
  fires after the resolver has actually been used (rather than at
  closure-construction time as a defer in the constructor would have
  done). Counters log as plain ints so structured-log backends can
  scrape them numerically. The digest skips constructing the
  resolver entirely when no PRs were fetched, so the summary line
  doesn't fire on degenerate runs.
- Document in the review search query that drafts are excluded but
  the SLA clock still runs through any draft period: webhook-recorded
  starts persist regardless of draft state, so a PR review-requested
  while draft and later un-drafted retains its original SLA start.

New tests cover prRefFromIssue, gatherReviewersForPR,
pickServiceGitHubUser (deterministic ordering, broken-token skip,
no-connected-users), and the summarize() semantics of
newDigestSLAStartResolver (no fire at construction, exactly one fire
on summarize()).
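The summarize() shape can be sketched as a constructor returning a closure over its counters; injecting the log function keeps the sketch self-contained and testable. Names and signatures are illustrative, not the plugin's API:

```go
package main

import "fmt"

// newResolver returns a resolve step plus a summarize callback that
// reports the counters accumulated so far. Nothing is logged at
// construction time; the caller defers summarize() so the summary
// reflects actual usage. Illustrative names only.
func newResolver(logf func(fetched, hits int)) (resolve func(kvHit bool), summarize func()) {
	var fetched, hits int
	resolve = func(kvHit bool) {
		if kvHit {
			hits++
		} else {
			fetched++
		}
	}
	summarize = func() {
		logf(fetched, hits) // plain ints, so structured backends keep them numeric
	}
	return resolve, summarize
}

func main() {
	resolve, summarize := newResolver(func(fetched, hits int) {
		fmt.Printf("[INFO] SLA digest backfill summary timeline_pages_fetched=%d kv_hits=%d\n", fetched, hits)
	})
	defer summarize() // fires once, after the resolver has been used
	resolve(true)
	resolve(false)
	resolve(true)
}
```

Contrast with the bug being fixed: a `defer logf(...)` placed inside `newResolver` itself would run when the constructor returned, always reporting zeros.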
@jgheithcock jgheithcock requested a review from a team as a code owner April 28, 2026 06:13
coderabbitai (Bot) commented Apr 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds an org-wide GraphQL query to fetch open, non-draft PRs with requested reviewers; refactors SLA start computation to use a PR-ref and timeline-backed KV resolver; updates digest collection/formatting and reviewer expansion/backfill; and surfaces SLA badges in the UI with Redux plumbing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GraphQL Digest Query (`server/plugin/graphql/digest_query.go`) | New GetOpenPRsWithRequestedReviewers plus DigestPR/DigestTeamRef types to page org-wide open, non-draft PRs and flatten requested reviewer data (users and teams). |
| SLA Computation & Timeline Helpers (`server/plugin/review_sla.go`, `server/plugin/review_sla_test.go`) | Refactor to use prRef abstraction; add storeReviewSLAStart, timeline fetch/pagination, helpers to find user/team request anchors, and unit tests for prRefFromIssue and timeline-based selectors. |
| Digest Pipeline & Backfill (`server/plugin/sla_digest.go`, `server/plugin/sla_digest_test.go`) | Switch digest to single org GraphQL scan using a deterministic service user; memoized team-member expansion; per-PR timeline backfill with KV persistence; dedupe reviewers case-insensitively; fixed overdue buckets; message formatting changes and extensive tests. |
| Plugin Integration (`server/plugin/plugin.go`) | Adjust GetToDo call site to pass prRefFromIssue(pr, baseURL) into SLA start computation. |
| Search & Formatting Utilities (`server/plugin/utils.go`) | Add draft:false to PR search query; replace per-PR row generator with formatChannelOverduePRBody; add escapeMarkdownLinkText. |
| Frontend: Sidebar SLA UI (`webapp/src/components/sidebar_right/github_items.tsx`, `webapp/src/components/sidebar_right/sidebar_right.jsx`, `webapp/src/components/sidebar_buttons/sidebar_buttons.jsx`) | Thread reviewTargetDays/showReviewSLA through props, render SLA badge with themed styles, use shared overdue detection, and guard against non-positive targetDays. |
| Frontend: Redux & Types (`webapp/src/selectors.ts`, `webapp/src/reducers/index.ts`, `webapp/src/types/github_types.ts`) | Expose reviewTargetDays from sidebar selector (default 0), tighten reducer typings, and extend types with showReviewSLA/reviewTargetDays. |
| Frontend: SLA Utils (`webapp/src/utils/sla.ts`) | Add ReviewSLAStatus, getReviewSLAStartIso, getReviewSLAStatus, and reviewsHaveOverdue for client-side SLA calculations using UTC calendar days. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Plugin as Plugin Service
    participant GraphQL as GraphQL Client
    participant GitHub as GitHub API
    participant KV as KV Store
    participant Timeline as Timeline API

    Plugin->>GraphQL: GetOpenPRsWithRequestedReviewers(org)
    GraphQL->>GitHub: Search open non-draft PRs with reviewRequests (paginated)
    GitHub-->>GraphQL: DigestPR[] (users + teams)
    GraphQL-->>Plugin: DigestPR[]

    Plugin->>KV: Batch lookup review_sla_start for PR refs
    KV-->>Plugin: cached timestamps (hits/misses)

    alt KV miss for PR
        Plugin->>Timeline: fetchPRTimeline(owner, repo, number)
        Timeline-->>Plugin: timeline events (paginated)
        Plugin->>Plugin: compute SLA start from events (user/team rules)
        Plugin->>KV: storeReviewSLAStart(prRef, timestamp)
    end

    Plugin->>Plugin: expand teams -> members (memoized)
    Plugin->>Plugin: dedupe reviewers, compute overdue buckets
    Plugin->>Plugin: build digest message and advance daily marker if scan ok
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes


Suggested labels

2: Dev Review, 3: QA Review

Suggested reviewers

  • avasconcelos114
  • ogi-m

Poem

🐰 I hopped through GraphQL fields so wide,

Collected PRs where reviewers hide,
Timelines checked and buckets arrayed,
Badges bloom where SLAs are weighed,
A tiny hop for digest pride.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 43.90%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title "MM-68539: SLA digest fixes" is partially related to the changeset, referring to SLA digest improvements but not capturing the main scope of the changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



CI's gofmt check flagged the var-block in
TestNewDigestSLAStartResolver_LogsAccurateCounters: the type column
was over-indented by one space because each name had been padded by
one too many.
coderabbitai (Bot) left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
webapp/src/components/sidebar_buttons/sidebar_buttons.jsx (1)

215-227: ⚠️ Potential issue | 🟡 Minor

Align disabled-SLA guard with shared utility semantics.

Line 216 should disable styling for non-positive values, not only falsy values, to avoid showing a misleading green state when targetDays is negative.

Suggested fix:

```diff
 function reviewButtonStyle(base, reviews, targetDays) {
-    if (!targetDays) {
+    if (!targetDays || targetDays <= 0) {
         return base;
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@webapp/src/components/sidebar_buttons/sidebar_buttons.jsx` around lines 215 -
227, The guard in reviewButtonStyle currently treats any falsy targetDays as
disabled, which misses negative values; update the check to treat non-positive
SLAs as disabled by changing the condition to explicitly return base when
targetDays is <= 0 (i.e., replace the falsy check with a numeric check), keeping
the rest of the logic (using reviewsHaveOverdue and returning color variants)
unchanged so negative targetDays won't show the green "online" state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/plugin/sla_digest.go`:
- Around line 250-258: gatherReviewersForPR currently flattens team requests to
plain logins losing team provenance; change it to return reviewer entries that
preserve origin (e.g., a small struct or tuple with Login and either TeamRef or
RequestTimestamp) instead of []string so downstream code (and
findMostRecentReviewRequestTime) can use team-scoped request times; update
gatherReviewersForPR, its call sites, and the resolver to propagate and prefer
the team request timestamp (or use the TeamRef to look up the team request
event) when computing SLA rather than falling back to pr.CreatedAt.
- Around line 154-171: The current SLA digest function returns nil on early
failures (no orgs, no service user, or no PRs), which the caller
maybePostDailyOverdueSLADigest treats the same as a successful empty run and
advances slaDigestDayKVKey; change the function to return an explicit error or
status so the caller can distinguish “could not run” from “ran and found
nothing.” Specifically, update the method signature to return (resultOK bool,
err error) or error, return a non-nil error (or resultOK=false) when
pickServiceGitHubUser, githubConnectUser, graphQLConnect, or fetchAllOrgOpenPRs
fail to produce data, and keep returning a success/empty result only when the
scan truly completed; then adjust the caller to only advance slaDigestDayKVKey
when resultOK is true and err is nil.
- Around line 321-325: The summary closure currently always logs a "SLA digest
backfill summary" even when no lookups ran; update the summarize function (the
closure assigned to summarize inside collectAllOverdueSLAItems /
summarizeSLAStart flow) to early-return when both fetched == 0 && hits == 0 so
the p.client.Log.Info call is skipped in that case, leaving the existing log
intact for non-zero activity.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d34db2c6-3623-475e-977e-2701615ea919

📥 Commits

Reviewing files that changed from the base of the PR and between b573b96 and d2eddaf.

📒 Files selected for processing (15)
  • server/plugin/graphql/digest_query.go
  • server/plugin/plugin.go
  • server/plugin/review_sla.go
  • server/plugin/review_sla_test.go
  • server/plugin/sla_digest.go
  • server/plugin/sla_digest_test.go
  • server/plugin/utils.go
  • webapp/src/components/sidebar_buttons/sidebar_buttons.jsx
  • webapp/src/components/sidebar_right/github_items.tsx
  • webapp/src/components/sidebar_right/index.jsx
  • webapp/src/components/sidebar_right/sidebar_right.jsx
  • webapp/src/reducers/index.ts
  • webapp/src/selectors.ts
  • webapp/src/types/github_types.ts
  • webapp/src/utils/sla.ts

coderabbitai (Bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/plugin/sla_digest_test.go`:
- Around line 366-369: The test is brittle because the plugintest API mock
assumes exactly 5 args for LogWarn/LogDebug while pickServiceGitHubUser calls
them with 1, 3, or 5; update the two api.On("LogWarn", ...) and
api.On("LogDebug", ...) setups in sla_digest_test.go to use a variadic-aware
matcher (e.g., mock.MatchedBy with a validator) instead of fixed five
mock.Anything entries so the mock accepts any valid arity (or specifically
1,3,5) — target the LogWarn and LogDebug expectations created on the api
variable (plugintest.API{}) and replace the arity-specific argument list with a
single mock.MatchedBy validator that returns true for acceptable argument
slices.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d573f40d-0171-4531-aeba-29d03017934d

📥 Commits

Reviewing files that changed from the base of the PR and between d2eddaf and be69b1c.

📒 Files selected for processing (1)
  • server/plugin/sla_digest_test.go

Four findings, all addressed:

1. (sidebar_buttons.jsx) Treat negative reviewTargetDays as disabled.
   reviewButtonStyle's `!targetDays` guard let a negative value through and
   would render the green "online" indicator instead of leaving the button
   neutral. Match the canonical `!targetDays || targetDays <= 0` guard used by
   getReviewSLAStatus / reviewsHaveOverdue.

2. (sla_digest.go) Differentiate "digest could not run" from "nothing
   overdue." collectAllOverdueSLAItems now returns (entries, ok). ok is false
   when no orgs are configured, no connected service user is available, or
   every configured org's GraphQL search failed (tracked via a new
   anyOrgOK return from fetchAllOrgOpenPRs). The caller only advances
   slaDigestDayKVKey when ok is true, so a transient first-of-day failure
   no longer skips every retry until tomorrow.

3. (sla_digest.go) Suppress the "SLA digest backfill summary" log when
   neither timeline_pages_fetched nor kv_hits moved. Prevents a misleading
   "0,0" line on degenerate runs (e.g. PRs without requested reviewers, or
   every reviewer hitting the early-return paths in resolve).

4. (sla_digest.go / review_sla.go) Preserve team-request provenance through
   SLA evaluation. gatherReviewersForPR now returns []reviewerRequest with
   each reviewer's originating teams attached and dedupes case-insensitively
   on login. The digest resolver, when no user-scoped review_requested
   timeline event exists for a reviewer, falls back to the earliest still-
   surviving team-scoped review_requested event via the new
   findEarliestSurvivingTeamRequestTime helper. This fixes the cold-cache
   case where a user pending solely because their team was requested would
   fall all the way back to pr.CreatedAt and overstate days-overdue. The
   resolved time is written to KV under the user-scoped key so subsequent
   runs hit the fast path.

New tests:
- TestGatherReviewersForPR cases for team-only origin, dedup of direct +
  team into a single entry retaining team origin, multi-team accumulation,
  and case-insensitive dedup.
- TestNewDigestSLAStartResolver_SkipsSummaryLogWhenNoLookups locks in the
  guarded summarize() behavior.
- TestFindEarliestSurvivingTeamRequestTime covers add/remove, multi-team
  earliest-wins semantics, defensive sort, and case-insensitive slug match.
coderabbitai (Bot) left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
server/plugin/sla_digest_test.go (1)

357-430: Add a cancellation-path regression test for digest collection.

Please add a test that asserts collectAllOverdueSLAItems returns ok=false when ctx.Err() is set mid-scan, so partial scans can’t be marked complete again.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/plugin/sla_digest_test.go` around lines 357 - 430, Add a new test that
verifies collectAllOverdueSLAItems returns ok=false when the context is canceled
mid-scan: use setupServiceUserPickTest to create p and mocks, create a context
with cancel, arrange one of the mocked calls used during scanning (e.g., the kv
store Get or timeline fetch mock used by
newDigestSLAStartResolver/collectAllOverdueSLAItems) to block on a channel so
the scan is in-progress, start collectAllOverdueSLAItems in a goroutine, cancel
the context, unblock the mock, then assert the returned ok is false and that no
summary log is emitted; reference collectAllOverdueSLAItems,
newDigestSLAStartResolver, setupServiceUserPickTest to locate where to wire the
blocking behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/plugin/sla_digest.go`:
- Around line 197-199: The current check treats a canceled context as success by
returning out, true when ctx.Err() != nil; change this to return out, false (or
otherwise indicate failure) so context cancellations are not considered
successful completions — locate the ctx.Err() check and replace the returned
success flag (the second return value) from true to false in the function that
uses the out variable.

---

Nitpick comments:
In `@server/plugin/sla_digest_test.go`:
- Around line 357-430: Add a new test that verifies collectAllOverdueSLAItems
returns ok=false when the context is canceled mid-scan: use
setupServiceUserPickTest to create p and mocks, create a context with cancel,
arrange one of the mocked calls used during scanning (e.g., the kv store Get or
timeline fetch mock used by newDigestSLAStartResolver/collectAllOverdueSLAItems)
to block on a channel so the scan is in-progress, start
collectAllOverdueSLAItems in a goroutine, cancel the context, unblock the mock,
then assert the returned ok is false and that no summary log is emitted;
reference collectAllOverdueSLAItems, newDigestSLAStartResolver,
setupServiceUserPickTest to locate where to wire the blocking behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 26d95991-2c59-4bf4-a843-05bab10350d1

📥 Commits

Reviewing files that changed from the base of the PR and between be69b1c and b085e9d.

📒 Files selected for processing (5)
  • server/plugin/review_sla.go
  • server/plugin/review_sla_test.go
  • server/plugin/sla_digest.go
  • server/plugin/sla_digest_test.go
  • webapp/src/components/sidebar_buttons/sidebar_buttons.jsx
🚧 Files skipped from review as they are similar to previous changes (1)
  • webapp/src/components/sidebar_buttons/sidebar_buttons.jsx

Comment thread server/plugin/sla_digest.go
golangci-lint's modernize linter (the rule that runs as part of
`make check-style`) flagged `team := team` inside the
RequestedTeams loop. Go 1.22+ gives every iteration its own loop
variable, so the manual capture is no longer needed. The inner
`add` call already dereferences `*team` and copies the value into
out[idx].Teams, so removing the shadow does not affect aliasing
either way.

Caught by `make check-style`; the previous commit was checked
with go vet / gofmt / ReadLints / eslint only, which all miss
golangci-lint's modernize ruleset.
CodeRabbit caught that the previous "differentiate could-not-run from
nothing-overdue" change still returned ok=true on a context-canceled
mid-scan, which would advance slaDigestDayKVKey for an interrupted
(partial) walk and suppress retries until the next local day.

Two call sites had this hole; both now return (nil, false) on
ctx.Err():

1. collectAllOverdueSLAItems' per-PR loop: a partial entry list is
   not a real scan, so the caller must not record the day marker.

2. fetchAllOrgOpenPRs' per-org loop: even if some orgs already
   responded successfully, a cancellation interrupted the walk;
   anyOK must be false so the caller treats it as an interrupted
   scan rather than a "real but quiet" one.

New test TestFetchAllOrgOpenPRs_CanceledContextReturnsNotOK locks
in (a) the (nil, false) contract and (b) that the cancellation path
fast-fails without dereferencing the GraphQL client.
Contributor

@nang2049 nang2049 left a comment


LGTM, one nitpick!

Comment thread server/plugin/sla_digest.go
Before: every overdue (reviewer, PR) pair rendered as its own bullet
with the reviewer prefix repeated on every row, so a reviewer with
five overdue PRs in a single bucket got @-mentioned five times in
that bucket and the digest was a wall of duplicated names.

After: within each bucket, entries are grouped by reviewer. Each
reviewer appears once as the outer bullet (@-mention is also issued
once per bucket), with the per-PR bodies as indented sub-bullets.
Reviewers within a bucket sort case-insensitively; PR bodies within
a reviewer group sort alphabetically. A reviewer who is overdue in
multiple buckets still gets one header per bucket they appear in.

  #### 1-3 days overdue
  - @alice (alice-gh)
    - mattermost/mattermost - [Fix encryption rotation](url)
  - @harrison (hmhealey)
    - mattermost/mattermost - [Fix race condition](url)
    - mattermost/plugin-github - [Add SLA digest](url)

Implementation:
- slaDigestEntry replaces its single Line field with separate
  ReviewerDisplay and Body fields so the builder can group without
  having to parse the pre-baked line back apart.
- formatChannelOverdueReviewLine becomes formatChannelOverduePRBody:
  returns just "<owner>/<repo> - [title](url)" so callers compose
  the outer bullet and the reviewer header. The same body is used
  under whichever reviewer header it belongs to.
- New groupBucketEntriesByReviewer helper handles the per-bucket
  collation with deterministic ordering (the digest is posted
  verbatim, so map iteration order alone would leave the output
  flaky run-to-run).

Tests:
- TestFormatChannelOverduePRBody retargets the existing line-shape
  invariants (escape, truncate, fallback URL) onto the new body
  helper, and adds an explicit "no leading bullet, no reviewer
  prefix" assertion locking in the composability contract.
- TestBuildSLADigestMessage gains:
    - "multiple PRs by the same reviewer in one bucket render under
      a single reviewer header with sorted indented bodies"
    - "two reviewers in the same bucket render as two separate
      reviewer groups"
    - direct unit test on groupBucketEntriesByReviewer for ordering
      and intra-group body sort.

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
server/plugin/sla_digest_test.go (1)

517-518: ⚠️ Potential issue | 🟡 Minor

Make variadic log mocks arity-tolerant to avoid fragile tests.

On Line 517 and Line 518, LogWarn/LogDebug are mocked with exactly 5 args. These methods are variadic, so fixed arity can break when call sites log with different key/value counts.

Suggested test hardening
-	api.On("LogWarn", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
-	api.On("LogDebug", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
+	// Accept known arities emitted by the code paths under test (msg only, msg+1 kv pair, msg+2 kv pairs).
+	api.On("LogWarn", mock.Anything).Maybe()
+	api.On("LogWarn", mock.Anything, mock.Anything, mock.Anything).Maybe()
+	api.On("LogWarn", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
+	api.On("LogDebug", mock.Anything).Maybe()
+	api.On("LogDebug", mock.Anything, mock.Anything, mock.Anything).Maybe()
+	api.On("LogDebug", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
#!/bin/bash
# Verify actual logging call shapes vs test expectations (read-only).
rg -nP --type=go -C3 'func \(p \*Plugin\) pickServiceGitHubUser\(' server/plugin/sla_digest.go
rg -nP --type=go -C2 '\b(LogWarn|LogDebug)\(' server/plugin/sla_digest.go
rg -nP --type=go -C2 'api\.On\("LogWarn"|api\.On\("LogDebug"' server/plugin/sla_digest_test.go
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@server/plugin/sla_digest_test.go`:
- Around line 517-518: The test is fragile because api.On("LogWarn", ...) and
api.On("LogDebug", ...) assert exactly five args while the real methods are
variadic; update the mocks to be arity-tolerant by using a matcher for the
variadic tail (e.g., keep matching the fixed leading args you care about, then
use mock.MatchedBy to accept any slice/extra key-value pairs) so LogWarn and
LogDebug expectations no longer fail when call sites pass a different number of
key/value pairs; update the api.On invocations for LogWarn and LogDebug to use
this matcher.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 091cbbb8-cfa8-4600-8ea3-25f1350c958e

📥 Commits

Reviewing files that changed from the base of the PR and between ef45f4e and ff472cf.

📒 Files selected for processing (3)
  • server/plugin/sla_digest.go
  • server/plugin/sla_digest_test.go
  • server/plugin/utils.go
✅ Files skipped from review due to trivial changes (1)
  • server/plugin/sla_digest.go

Address @nang2049's nit on PR #999 about the empty-login guard
being implicit. The current code is correct: both call paths
(pr.RequestedUsers and team-expanded logins) feed the shared add()
closure which has `if login == "" { return }` at the top, AND the
GraphQL layer in digest_query.go already drops empty user logins
upstream.

But the protection isn't visible at the loop sites, and there was
no test pinning the contract down — a future refactor that splits
the closure or replaces the helper could quietly drop the guard.

- Comment on add() now spells out that it's the single chokepoint
  for both paths and explains why the guard belongs there rather
  than being duplicated at each loop body.
- New test "empty logins are dropped from both direct and team-
  expanded paths" exercises empty strings on each input vector at
  once and asserts no reviewerRequest with Login=="" can reach the
  caller. If a future change removes the guard from add() or puts
  it in only one of the two loops, this test breaks loudly.
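
The single-chokepoint pattern this commit documents can be sketched as follows (names are illustrative, not the plugin's real add() closure): both the direct-user path and the team-expanded path funnel through one closure that drops empty logins and de-duplicates, so the guard never needs to be repeated at each loop body.

```go
package main

import "fmt"

// addRequests merges direct reviewer requests with team-expanded logins
// through a single add closure. The empty-login guard lives here, at the
// one chokepoint both paths share: an empty login can never reach the
// caller regardless of which input vector produced it, and moving the
// guard into only one loop would silently reopen the hole.
func addRequests(direct, teamExpanded []string) []string {
	var out []string
	seen := map[string]bool{}
	add := func(login string) {
		if login == "" {
			return // guard for both paths
		}
		if seen[login] {
			return // de-duplicate across direct and team requests
		}
		seen[login] = true
		out = append(out, login)
	}
	for _, l := range direct {
		add(l)
	}
	for _, l := range teamExpanded {
		add(l)
	}
	return out
}

func main() {
	fmt.Println(addRequests([]string{"alice", ""}, []string{"", "bob", "alice"}))
}
```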
@jgheithcock jgheithcock merged commit 92cb37c into master Apr 28, 2026
19 checks passed
@jgheithcock jgheithcock deleted the MM-68539-sla-digest-fixes branch April 28, 2026 23:08