MM-68539: SLA digest fixes#999
Conversation
Drafts are not ready for review and shouldn't appear in /github todo, the daily reminder DM, or the SLA overdue digest. Add draft:false to the shared review search query so all three call sites filter drafts consistently.
Replace the per-day buckets with seven fixed ranges (1-3 / 4-7 / 8-14 / 15-30 / 31-90 / 91-365 / >1 year) so a digest with many overdue PRs stays scannable instead of producing dozens of single-day headers. Also surface the configured SLA target in the top-level header so readers can see what threshold the buckets are measured against.
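For illustration, a minimal Go sketch of how fixed-range bucketing like this could be expressed. The ranges come from the message above; the type and function names are hypothetical, not the plugin's actual identifiers.

```go
package plugin

// slaBucket is one fixed range of days overdue (illustrative type).
type slaBucket struct {
	label    string
	min, max int // inclusive bounds; max < 0 means "no upper bound"
}

// slaBuckets mirrors the seven ranges described in the commit message.
var slaBuckets = []slaBucket{
	{"1-3 days overdue", 1, 3},
	{"4-7 days overdue", 4, 7},
	{"8-14 days overdue", 8, 14},
	{"15-30 days overdue", 15, 30},
	{"31-90 days overdue", 31, 90},
	{"91-365 days overdue", 91, 365},
	{">1 year overdue", 366, -1},
}

// bucketFor returns the header label for a days-overdue count, or "" when the
// PR is not overdue at all, so the digest emits at most seven headers.
func bucketFor(daysOverdue int) string {
	for _, b := range slaBuckets {
		if daysOverdue >= b.min && (b.max < 0 || daysOverdue <= b.max) {
			return b.label
		}
	}
	return ""
}
```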
Render the PR title as a markdown link so readers can click straight through from the channel digest to the pull request. Escape `]` and `\` in titles so JIRA-prefixed names like "[MM-12345] Fix thing" do not terminate the link early.
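A minimal sketch of the link-text escaping described here, assuming a plain string-replacement approach; the helper name is hypothetical.

```go
package plugin

import "strings"

// escapeMarkdownLinkText escapes the characters that would terminate or
// corrupt a [text](url) link: backslashes first, then closing brackets.
func escapeMarkdownLinkText(title string) string {
	title = strings.ReplaceAll(title, `\`, `\\`)
	return strings.ReplaceAll(title, "]", `\]`)
}
```

With this, a title like "[MM-12345] Fix thing" becomes "[MM-12345\] Fix thing" before being wrapped in the link, so the inner "]" no longer terminates the link early.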
Map each reviewer's GitHub login to their Mattermost username via the existing getGitHubToUsernameMapping helper so the digest produces real Mattermost @-mentions: - @harrison (hmhealey) - owner/repo - [title](url) Reviewers who haven't connected the GitHub plugin render as "(not connected) - <githublogin>" so the digest still surfaces them without misleading anyone with an @ that won't notify.
The digest's days-overdue clock previously used the cached time of the review_requested webhook, falling back to the PR open date when no KV record existed. PRs whose most recent review was requested before the plugin was installed therefore showed nonsense numbers like 86 days overdue when the actual outstanding review was a few days old. When the digest collector encounters a PR with no cached SLA start, it now pages the issue timeline once to find the most recent surviving review_requested event for the reviewer and caches the result under the existing slarr_v1_* key. Subsequent runs are free. Only the digest path uses the new effectiveReviewSLAStartForDigest variant; /github todo and the RHS panel keep using the no-network effectiveReviewSLAStart so user-facing requests stay fast.
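A minimal Go sketch of the timeline walk described above. The real code pages the GitHub issue timeline through the API client; here the event type is a simplified stand-in and the function name only echoes findMostRecentReviewRequestTime.

```go
package plugin

import (
	"sort"
	"time"
)

// timelineEvent is a simplified stand-in for a PR issue-timeline entry.
type timelineEvent struct {
	Event         string // "review_requested" or "review_request_removed"
	ReviewerLogin string
	CreatedAt     time.Time
}

// mostRecentSurvivingReviewRequest returns the latest review_requested time
// for the reviewer that was not later removed; the zero time means no
// surviving request exists, so the caller falls back elsewhere.
func mostRecentSurvivingReviewRequest(events []timelineEvent, reviewer string) time.Time {
	// Defensive sort so events are processed in chronological order.
	sort.Slice(events, func(i, j int) bool {
		return events[i].CreatedAt.Before(events[j].CreatedAt)
	})

	var start time.Time
	for _, ev := range events {
		if ev.ReviewerLogin != reviewer {
			continue
		}
		switch ev.Event {
		case "review_requested":
			start = ev.CreatedAt
		case "review_request_removed":
			start = time.Time{} // request withdrawn; only a later request restarts the clock
		}
	}
	return start
}
```

A result found this way would then be cached under the existing slarr_v1_* key so the next run skips the timeline fetch.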
The digest used to iterate over each connected Mattermost user and run a separate GitHub search per user. That meant a reviewer was only included when they had personally connected their GitHub account, which is why the first run only listed hmhealey. Replace that loop with a single org-wide GraphQL search per configured organization. The query asks for every open non-draft PR plus its requestedReviewers (users and teams). Team review requests are expanded once per run via the Teams API, with results cached so we don't refetch the same team on every PR. Each (PR, reviewer) pair is deduped before being scored against the SLA target. The org scan needs an authenticated caller, so we pick the first connected Mattermost user as a service user. If no Mattermost user has connected a GitHub account yet, the digest logs a warning and skips posting rather than running with no token. Also cleans up an embedded-field selector in findMostRecentReviewRequestTime that staticcheck flagged after the timeline backfill commit landed.
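Two of the moving parts here, per-run team-member memoization and (PR, reviewer) dedupe, can be sketched in a few lines of Go. Both functions are illustrative; the real code works against the plugin's GitHub client and richer types.

```go
package plugin

// membersOfTeam memoizes team expansion so each team's membership is fetched
// at most once per digest run, no matter how many PRs request that team.
func membersOfTeam(cache map[string][]string, teamSlug string, fetch func(slug string) []string) []string {
	if members, ok := cache[teamSlug]; ok {
		return members
	}
	members := fetch(teamSlug)
	cache[teamSlug] = members
	return members
}

// dedupePairs keeps one entry per (PR key, reviewer login) pair so a reviewer
// requested both directly and via a team is scored against the SLA once.
func dedupePairs(pairs [][2]string) [][2]string {
	seen := map[[2]string]bool{}
	out := make([][2]string, 0, len(pairs))
	for _, pair := range pairs {
		if seen[pair] {
			continue
		}
		seen[pair] = true
		out = append(out, pair)
	}
	return out
}
```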
The "Pull Requests Needing Review" panel showed each PR's open date but not whether the review was actually overdue against the configured SLA, so engineers had no way to spot urgent items at a glance. Render an SLA badge per review item with one of three states: "Overdue by N days" (red), "Due today" (amber), or "Due in N days" (neutral). The badge only appears in the REVIEWS panel and only when an admin has set a positive Review Target Days; other RHS panels and unconfigured installs are unchanged. Extract the SLA-start and overdue computation into webapp/src/utils/sla so the existing red-button indicator on the sidebar buttons and the new per-item badge agree on what "overdue" means. The math mirrors the server's calendar-day diff (UTC midnight to UTC midnight) so a PR that is overdue in the digest is overdue in the RHS too. The shared util prefers review_sla_start (which the server already populates from the review_request webhook or timeline backfill) and falls back to created_at, matching the digest's effectiveReviewSLAStart. Also tightens the configuration reducer's typing so review_target_days flows through getSidebarData without needing per-call assertions.
orgOpenPRsQuery was a package-level var that the graphql library decoded into. Concurrent callers — e.g. the daily digest scheduler running alongside an LHS fetch on a different goroutine — could clobber each other's Search.Nodes mid-decode, producing partial or duplicated PR lists. Define the type at package scope (so the graphql library can still reflect on its tags) but instantiate locally per call so each caller writes into its own struct.
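A minimal sketch of the fix's shape, assuming a shurcooL-style GraphQL client that decodes into tagged structs; the tag text, the DigestPR fields, and the run callback are placeholders, not the plugin's exact code.

```go
package graphql

// DigestPR stands in for the plugin's per-PR result node (fields elided).
type DigestPR struct {
	// ...
}

// orgOpenPRsQuery stays a package-scope *type* so the GraphQL library can
// still reflect on its struct tags, but nothing decodes into a shared
// package-level value anymore.
type orgOpenPRsQuery struct {
	Search struct {
		Nodes []DigestPR
	} `graphql:"search(query: $searchQuery, type: ISSUE, first: 100)"`
}

// fetchPage shows the per-call instantiation: each caller gets its own q, so
// the digest scheduler and an LHS fetch on another goroutine can no longer
// clobber each other's Search.Nodes mid-decode. run stands in for the
// client's query/decode call.
func fetchPage(run func(q interface{}) error) ([]DigestPR, error) {
	var q orgOpenPRsQuery // local value, never shared across goroutines
	if err := run(&q); err != nil {
		return nil, err
	}
	return q.Search.Nodes, nil
}
```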
The digest's collectAllOverdueSLAItems was a 90-line function that forged synthetic *github.Issue values to feed into helpers designed for search-result PRs. Replace that with a small abstraction and pull each concern into its own function that can be tested in isolation.

- Introduce prRef as the minimal PR identity SLA code needs (owner, repo, number, created_at). effectiveReviewSLAStart now takes prRef directly; user-facing call sites build one via prRefFromIssue. The digest constructs prRef directly from graphql.DigestPR with no forging.
- Extract storeReviewSLAStart so the live review_requested webhook and the timeline backfill share one wire-format codepath instead of duplicating the RFC3339Nano encode-and-Set logic.
- Extract fetchPRTimeline so timeline events can be cached at the digest level. Previously a PR with N requested reviewers triggered N timeline fetches; now it's one fetch shared across all reviewers on that PR.
- Split collectAllOverdueSLAItems into fetchAllOrgOpenPRs, newTeamMemberResolver, gatherReviewersForPR, newDigestSLAStartResolver, and evaluateOverdueForReviewer.
- Make pickServiceGitHubUser deterministic by sorting all token-key pages before iterating. Without this, a multi-node cluster could silently pick different service users on different days as KV iteration order shifted, changing which private repos and teams the digest could see.
- newDigestSLAStartResolver returns a summarize() callback the caller invokes via defer, so its Info-level "backfill summary" log fires after the resolver has actually been used (rather than at closure-construction time, as a defer in the constructor would have done). Counters log as plain ints so structured-log backends can scrape them numerically. The digest skips constructing the resolver entirely when no PRs were fetched, so the summary line doesn't fire on degenerate runs.
- Document in the review search query that drafts are excluded but the SLA clock still runs through any draft period: webhook-recorded starts persist regardless of draft state, so a PR review-requested while draft and later un-drafted retains its original SLA start.

New tests cover prRefFromIssue, gatherReviewersForPR, pickServiceGitHubUser (deterministic ordering, broken-token skip, no-connected-users), and the summarize() semantics of newDigestSLAStartResolver (no fire at construction, exactly one fire on summarize()).
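A minimal sketch of the summarize()-callback pattern from the last two bullets; the names echo newDigestSLAStartResolver but the body is illustrative, and the logger parameter merely stands in for p.client.Log.Info.

```go
package plugin

// newCountingResolver returns both the resolver and a summarize callback.
// The caller defers summarize(), so the Info line fires after the resolver
// has actually been used rather than at closure-construction time.
func newCountingResolver(logInfo func(msg string, keyValuePairs ...interface{})) (resolve func(prKey string), summarize func()) {
	var timelinePagesFetched, kvHits int

	resolve = func(prKey string) {
		// Real code: try KV first (kvHits++), fall back to one timeline fetch
		// per PR on a miss (timelinePagesFetched++), then cache the result.
		kvHits++
	}

	summarize = func() {
		// Counters are passed as plain ints so structured-log backends keep
		// them numeric instead of having to scrape a formatted string.
		logInfo("SLA digest backfill summary",
			"timeline_pages_fetched", timelinePagesFetched,
			"kv_hits", kvHits,
		)
	}
	return resolve, summarize
}
```

Caller side, as described above: construct the resolver only when there are PRs to score, then `defer summarize()` so the line is emitted once per digest run.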
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds an org-wide GraphQL query to fetch open, non-draft PRs with requested reviewers; refactors SLA start computation to use a PR-ref and timeline-backed KV resolver; updates digest collection/formatting and reviewer expansion/backfill; and surfaces SLA badges in the UI with Redux plumbing.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Plugin as Plugin Service
    participant GraphQL as GraphQL Client
    participant GitHub as GitHub API
    participant KV as KV Store
    participant Timeline as Timeline API
    Plugin->>GraphQL: GetOpenPRsWithRequestedReviewers(org)
    GraphQL->>GitHub: Search open non-draft PRs with reviewRequests (paginated)
    GitHub-->>GraphQL: DigestPR[] (users + teams)
    GraphQL-->>Plugin: DigestPR[]
    Plugin->>KV: Batch lookup review_sla_start for PR refs
    KV-->>Plugin: cached timestamps (hits/misses)
    alt KV miss for PR
        Plugin->>Timeline: fetchPRTimeline(owner, repo, number)
        Timeline-->>Plugin: timeline events (paginated)
        Plugin->>Plugin: compute SLA start from events (user/team rules)
        Plugin->>KV: storeReviewSLAStart(prRef, timestamp)
    end
    Plugin->>Plugin: expand teams -> members (memoized)
    Plugin->>Plugin: dedupe reviewers, compute overdue buckets
    Plugin->>Plugin: build digest message and advance daily marker if scan ok
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
CI's gofmt check flagged the var-block in TestNewDigestSLAStartResolver_LogsAccurateCounters: the type column was over-indented by one space because each name had been padded with one space too many.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
webapp/src/components/sidebar_buttons/sidebar_buttons.jsx (1)
215-227: ⚠️ Potential issue | 🟡 Minor
Align disabled-SLA guard with shared utility semantics.
Line 216 should disable styling for non-positive values, not only falsy values, to avoid showing a misleading green state when `targetDays` is negative.
Suggested fix

```diff
 function reviewButtonStyle(base, reviews, targetDays) {
-    if (!targetDays) {
+    if (!targetDays || targetDays <= 0) {
         return base;
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webapp/src/components/sidebar_buttons/sidebar_buttons.jsx` around lines 215 - 227, The guard in reviewButtonStyle currently treats any falsy targetDays as disabled, which misses negative values; update the check to treat non-positive SLAs as disabled by changing the condition to explicitly return base when targetDays is <= 0 (i.e., replace the falsy check with a numeric check), keeping the rest of the logic (using reviewsHaveOverdue and returning color variants) unchanged so negative targetDays won't show the green "online" state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/plugin/sla_digest.go`:
- Around line 250-258: gatherReviewersForPR currently flattens team requests to
plain logins losing team provenance; change it to return reviewer entries that
preserve origin (e.g., a small struct or tuple with Login and either TeamRef or
RequestTimestamp) instead of []string so downstream code (and
findMostRecentReviewRequestTime) can use team-scoped request times; update
gatherReviewersForPR, its call sites, and the resolver to propagate and prefer
the team request timestamp (or use the TeamRef to look up the team request
event) when computing SLA rather than falling back to pr.CreatedAt.
- Around line 154-171: The current SLA digest function returns nil on early
failures (no orgs, no service user, or no PRs), which the caller
maybePostDailyOverdueSLADigest treats the same as a successful empty run and
advances slaDigestDayKVKey; change the function to return an explicit error or
status so the caller can distinguish “could not run” from “ran and found
nothing.” Specifically, update the method signature to return (resultOK bool,
err error) or error, return a non-nil error (or resultOK=false) when
pickServiceGitHubUser, githubConnectUser, graphQLConnect, or fetchAllOrgOpenPRs
fail to produce data, and keep returning a success/empty result only when the
scan truly completed; then adjust the caller to only advance slaDigestDayKVKey
when resultOK is true and err is nil.
- Around line 321-325: The summary closure currently always logs a "SLA digest
backfill summary" even when no lookups ran; update the summarize function (the
closure assigned to summarize inside collectAllOverdueSLAItems /
summarizeSLAStart flow) to early-return when both fetched == 0 && hits == 0 so
the p.client.Log.Info call is skipped in that case, leaving the existing log
intact for non-zero activity.
📒 Files selected for processing (15)
- server/plugin/graphql/digest_query.go
- server/plugin/plugin.go
- server/plugin/review_sla.go
- server/plugin/review_sla_test.go
- server/plugin/sla_digest.go
- server/plugin/sla_digest_test.go
- server/plugin/utils.go
- webapp/src/components/sidebar_buttons/sidebar_buttons.jsx
- webapp/src/components/sidebar_right/github_items.tsx
- webapp/src/components/sidebar_right/index.jsx
- webapp/src/components/sidebar_right/sidebar_right.jsx
- webapp/src/reducers/index.ts
- webapp/src/selectors.ts
- webapp/src/types/github_types.ts
- webapp/src/utils/sla.ts
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/plugin/sla_digest_test.go`:
- Around line 366-369: The test is brittle because the plugintest API mock
assumes exactly 5 args for LogWarn/LogDebug while pickServiceGitHubUser calls
them with 1, 3, or 5; update the two api.On("LogWarn", ...) and
api.On("LogDebug", ...) setups in sla_digest_test.go to use a variadic-aware
matcher (e.g., mock.MatchedBy with a validator) instead of fixed five
mock.Anything entries so the mock accepts any valid arity (or specifically
1,3,5) — target the LogWarn and LogDebug expectations created on the api
variable (plugintest.API{}) and replace the arity-specific argument list with a
single mock.MatchedBy validator that returns true for acceptable argument
slices.
📒 Files selected for processing (1)
server/plugin/sla_digest_test.go
Four findings, all addressed:

1. (sidebar_buttons.jsx) Treat negative reviewTargetDays as disabled. reviewButtonStyle's `!targetDays` guard let a negative value through and would render the green "online" indicator instead of leaving the button neutral. Match the canonical `!targetDays || targetDays <= 0` guard used by getReviewSLAStatus / reviewsHaveOverdue.
2. (sla_digest.go) Differentiate "digest could not run" from "nothing overdue." collectAllOverdueSLAItems now returns (entries, ok). ok is false when no orgs are configured, no connected service user is available, or every configured org's GraphQL search failed (tracked via a new anyOrgOK return from fetchAllOrgOpenPRs). The caller only advances slaDigestDayKVKey when ok is true, so a transient first-of-day failure no longer skips every retry until tomorrow.
3. (sla_digest.go) Suppress the "SLA digest backfill summary" log when neither timeline_pages_fetched nor kv_hits moved. Prevents a misleading "0,0" line on degenerate runs (e.g. PRs without requested reviewers, or every reviewer hitting the early-return paths in resolve).
4. (sla_digest.go / review_sla.go) Preserve team-request provenance through SLA evaluation. gatherReviewersForPR now returns []reviewerRequest with each reviewer's originating teams attached and dedupes case-insensitively on login. The digest resolver, when no user-scoped review_requested timeline event exists for a reviewer, falls back to the earliest still-surviving team-scoped review_requested event via the new findEarliestSurvivingTeamRequestTime helper. This fixes the cold-cache case where a user pending solely because their team was requested would fall all the way back to pr.CreatedAt and overstate days-overdue. The resolved time is written to KV under the user-scoped key so subsequent runs hit the fast path.

New tests:
- TestGatherReviewersForPR cases for team-only origin, dedup of direct + team into a single entry retaining team origin, multi-team accumulation, and case-insensitive dedup.
- TestNewDigestSLAStartResolver_SkipsSummaryLogWhenNoLookups locks in the guarded summarize() behavior.
- TestFindEarliestSurvivingTeamRequestTime covers add/remove, multi-team earliest-wins semantics, defensive sort, and case-insensitive slug match.
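For illustration, a Go sketch of the team-scoped fallback described in item 4. The types and the function body are simplified stand-ins; only the behavior (earliest still-surviving team request wins, case-insensitive slug match) mirrors the new findEarliestSurvivingTeamRequestTime helper.

```go
package plugin

import (
	"strings"
	"time"
)

// teamTimelineEvent is a simplified team-scoped timeline entry.
type teamTimelineEvent struct {
	Event     string // "review_requested" or "review_request_removed"
	TeamSlug  string
	CreatedAt time.Time
}

// earliestSurvivingTeamRequestTime returns the earliest still-surviving
// team-scoped review_requested time across the reviewer's originating teams,
// so a team-only reviewer's SLA clock doesn't fall back to pr.CreatedAt.
func earliestSurvivingTeamRequestTime(events []teamTimelineEvent, originTeams []string) time.Time {
	wanted := map[string]bool{}
	for _, t := range originTeams {
		wanted[strings.ToLower(t)] = true
	}

	surviving := map[string]time.Time{} // slug -> latest un-removed request
	for _, ev := range events {         // assumed chronological
		slug := strings.ToLower(ev.TeamSlug)
		if !wanted[slug] {
			continue
		}
		switch ev.Event {
		case "review_requested":
			surviving[slug] = ev.CreatedAt
		case "review_request_removed":
			delete(surviving, slug)
		}
	}

	var earliest time.Time
	for _, t := range surviving {
		if earliest.IsZero() || t.Before(earliest) {
			earliest = t
		}
	}
	return earliest
}
```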
Actionable comments posted: 1
🧹 Nitpick comments (1)
server/plugin/sla_digest_test.go (1)
357-430: Add a cancellation-path regression test for digest collection.
Please add a test that asserts `collectAllOverdueSLAItems` returns `ok=false` when `ctx.Err()` is set mid-scan, so partial scans can't be marked complete again.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/plugin/sla_digest_test.go` around lines 357 - 430, Add a new test that verifies collectAllOverdueSLAItems returns ok=false when the context is canceled mid-scan: use setupServiceUserPickTest to create p and mocks, create a context with cancel, arrange one of the mocked calls used during scanning (e.g., the kv store Get or timeline fetch mock used by newDigestSLAStartResolver/collectAllOverdueSLAItems) to block on a channel so the scan is in-progress, start collectAllOverdueSLAItems in a goroutine, cancel the context, unblock the mock, then assert the returned ok is false and that no summary log is emitted; reference collectAllOverdueSLAItems, newDigestSLAStartResolver, setupServiceUserPickTest to locate where to wire the blocking behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/plugin/sla_digest.go`:
- Around line 197-199: The current check treats a canceled context as success by
returning out, true when ctx.Err() != nil; change this to return out, false (or
otherwise indicate failure) so context cancellations are not considered
successful completions — locate the ctx.Err() check and replace the returned
success flag (the second return value) from true to false in the function that
uses the out variable.
📒 Files selected for processing (5)
- server/plugin/review_sla.go
- server/plugin/review_sla_test.go
- server/plugin/sla_digest.go
- server/plugin/sla_digest_test.go
- webapp/src/components/sidebar_buttons/sidebar_buttons.jsx
🚧 Files skipped from review as they are similar to previous changes (1)
- webapp/src/components/sidebar_buttons/sidebar_buttons.jsx
golangci-lint's modernize linter (the rule that runs as part of `make check-style`) flagged `team := team` inside the RequestedTeams loop. Go 1.22+ gives every iteration its own loop variable, so the manual capture is no longer needed. The inner `add` call already dereferences `*team` and copies the value into out[idx].Teams, so removing the shadow does not affect aliasing either way. Caught by `make check-style`; the previous commit was checked with go vet / gofmt / ReadLints / eslint only, which all miss golangci-lint's modernize ruleset.
CodeRabbit caught that the previous "differentiate could-not-run from nothing-overdue" change still returned ok=true when the context was canceled mid-scan, which would advance slaDigestDayKVKey for an interrupted (partial) walk and suppress retries until the next local day. Two call sites had this hole; both now return (nil, false) on ctx.Err():

1. collectAllOverdueSLAItems' per-PR loop: a partial entry list is not a real scan, so the caller must not record the day marker.
2. fetchAllOrgOpenPRs' per-org loop: even if some orgs already responded successfully, a cancellation interrupted the walk; anyOK must be false so the caller treats it as an interrupted scan rather than a "real but quiet" one.

New test TestFetchAllOrgOpenPRs_CanceledContextReturnsNotOK locks in (a) the (nil, false) contract and (b) that the cancellation path fast-fails without dereferencing the GraphQL client.
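The contract both call sites now follow can be sketched in a few lines; the function shape here is illustrative.

```go
package plugin

import "context"

// A canceled context means the walk was interrupted, so the caller must not
// treat the partial result as a completed scan (and must not advance the
// daily marker).
func collectEntries(ctx context.Context, prs []string, evaluate func(pr string) []string) (entries []string, ok bool) {
	for _, pr := range prs {
		if ctx.Err() != nil {
			return nil, false // partial scan: not a real result
		}
		entries = append(entries, evaluate(pr)...)
	}
	return entries, true
}
```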
Before: every overdue (reviewer, PR) pair rendered as its own bullet with the reviewer prefix repeated on every row, so a reviewer with five overdue PRs in a single bucket got @-mentioned five times in that bucket and the digest was a wall of duplicated names.

After: within each bucket, entries are grouped by reviewer. Each reviewer appears once as the outer bullet (@-mention is also issued once per bucket), with the per-PR bodies as indented sub-bullets. Reviewers within a bucket sort case-insensitively; PR bodies within a reviewer group sort alphabetically. A reviewer who is overdue in multiple buckets still gets one header per bucket they appear in.

#### 1-3 days overdue
- @alice (alice-gh)
  - mattermost/mattermost - [Fix encryption rotation](url)
- @harrison (hmhealey)
  - mattermost/mattermost - [Fix race condition](url)
  - mattermost/plugin-github - [Add SLA digest](url)

Implementation:
- slaDigestEntry replaces its single Line field with separate ReviewerDisplay and Body fields so the builder can group without having to parse the pre-baked line back apart.
- formatChannelOverdueReviewLine becomes formatChannelOverduePRBody: returns just "<owner>/<repo> - [title](url)" so callers compose the outer bullet and the reviewer header. The same body is used under whichever reviewer header it belongs to.
- New groupBucketEntriesByReviewer helper handles the per-bucket collation with deterministic ordering (the digest is posted verbatim, so map iteration order alone would leave the output flaky run-to-run).

Tests:
- TestFormatChannelOverduePRBody retargets the existing line-shape invariants (escape, truncate, fallback URL) onto the new body helper, and adds an explicit "no leading bullet, no reviewer prefix" assertion locking in the composability contract.
- TestBuildSLADigestMessage gains:
  - "multiple PRs by the same reviewer in one bucket render under a single reviewer header with sorted indented bodies"
  - "two reviewers in the same bucket render as two separate reviewer groups"
- Direct unit test on groupBucketEntriesByReviewer for ordering and intra-group body sort.
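A Go sketch of the deterministic per-bucket grouping described above; the entry fields mirror the ReviewerDisplay/Body split, but the exact rendering is illustrative.

```go
package plugin

import (
	"sort"
	"strings"
)

// slaDigestEntry mirrors the split described above: who to mention, plus the
// "<owner>/<repo> - [title](url)" body that goes under their header.
type slaDigestEntry struct {
	ReviewerDisplay string
	Body            string
}

// groupBucketEntriesByReviewer collates one bucket's entries with
// deterministic ordering, since the digest post is rendered verbatim:
// reviewers sort case-insensitively, bodies sort alphabetically per reviewer.
func groupBucketEntriesByReviewer(entries []slaDigestEntry) []string {
	byReviewer := map[string][]string{}
	for _, e := range entries {
		byReviewer[e.ReviewerDisplay] = append(byReviewer[e.ReviewerDisplay], e.Body)
	}

	reviewers := make([]string, 0, len(byReviewer))
	for r := range byReviewer {
		reviewers = append(reviewers, r)
	}
	sort.Slice(reviewers, func(i, j int) bool {
		return strings.ToLower(reviewers[i]) < strings.ToLower(reviewers[j])
	})

	var lines []string
	for _, r := range reviewers {
		bodies := byReviewer[r]
		sort.Strings(bodies)
		lines = append(lines, "- "+r)
		for _, b := range bodies {
			lines = append(lines, "  - "+b)
		}
	}
	return lines
}
```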
♻️ Duplicate comments (1)
server/plugin/sla_digest_test.go (1)
517-518: ⚠️ Potential issue | 🟡 Minor
Make variadic log mocks arity-tolerant to avoid fragile tests.
On Line 517 and Line 518, `LogWarn`/`LogDebug` are mocked with exactly 5 args. These methods are variadic, so fixed arity can break when call sites log with different key/value counts.
Suggested test hardening

```diff
-    api.On("LogWarn", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
-    api.On("LogDebug", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
+    // Accept known arities emitted by the code paths under test (msg only, msg+1 kv pair, msg+2 kv pairs).
+    api.On("LogWarn", mock.Anything).Maybe()
+    api.On("LogWarn", mock.Anything, mock.Anything, mock.Anything).Maybe()
+    api.On("LogWarn", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
+    api.On("LogDebug", mock.Anything).Maybe()
+    api.On("LogDebug", mock.Anything, mock.Anything, mock.Anything).Maybe()
+    api.On("LogDebug", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Maybe()
```

```bash
#!/bin/bash
# Verify actual logging call shapes vs test expectations (read-only).
rg -nP --type=go -C3 'func \(p \*Plugin\) pickServiceGitHubUser\(' server/plugin/sla_digest.go
rg -nP --type=go -C2 '\b(LogWarn|LogDebug)\(' server/plugin/sla_digest.go
rg -nP --type=go -C2 'api\.On\("LogWarn"|api\.On\("LogDebug"' server/plugin/sla_digest_test.go
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/plugin/sla_digest_test.go` around lines 517 - 518, The test is fragile because api.On("LogWarn", ...) and api.On("LogDebug", ...) assert exactly five args while the real methods are variadic; update the mocks to be arity-tolerant by using a matcher for the variadic tail (e.g., keep matching the fixed leading args you care about, then use mock.MatchedBy to accept any slice/extra key-value pairs) so LogWarn and LogDebug expectations no longer fail when call sites pass a different number of key/value pairs; update the api.On invocations for LogWarn and LogDebug to use this matcher.
📒 Files selected for processing (3)
- server/plugin/sla_digest.go
- server/plugin/sla_digest_test.go
- server/plugin/utils.go
✅ Files skipped from review due to trivial changes (1)
- server/plugin/sla_digest.go
Address @nang2049's nit on PR #999 about the empty-login guard being implicit. The current code is correct: both call paths (pr.RequestedUsers and team-expanded logins) feed the shared add() closure which has `if login == "" { return }` at the top, AND the GraphQL layer in digest_query.go already drops empty user logins upstream. But the protection isn't visible at the loop sites, and there was no test pinning the contract down — a future refactor that splits the closure or replaces the helper could quietly drop the guard.

- Comment on add() now spells out that it's the single chokepoint for both paths and explains why the guard belongs there rather than being duplicated at each loop body.
- New test "empty logins are dropped from both direct and team-expanded paths" exercises empty strings on each input vector at once and asserts no reviewerRequest with Login=="" can reach the caller. If a future change removes the guard from add() or puts it in only one of the two loops, this test breaks loudly.
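For illustration, a minimal sketch of the single-chokepoint guard; the function and its inputs are simplified stand-ins for gatherReviewersForPR's real shape.

```go
package plugin

// Both the direct-reviewer loop and the team-expansion loop funnel through
// one add() closure, so the empty-login guard lives in exactly one place.
func gatherLogins(direct, teamExpanded []string) []string {
	var out []string
	seen := map[string]bool{}
	add := func(login string) {
		if login == "" {
			return // single guard point for both input paths
		}
		if seen[login] {
			return
		}
		seen[login] = true
		out = append(out, login)
	}
	for _, l := range direct {
		add(l)
	}
	for _, l := range teamExpanded {
		add(l)
	}
	return out
}
```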
Summary
This branch hardens the daily PR-review SLA digest in three ways:
Coverage. The digest now scans every open non-draft PR in each configured org via a single GraphQL search, instead of running a per-user search for each connected Mattermost user (the prior approach silently skipped reviewers who hadn't connected their GitHub account). Direct user-requests and team-requests are flattened and de-duplicated; team membership is fetched once per digest run.
Correctness. The SLA clock now backfills from the PR timeline whenever the plugin doesn't have a cached `review_requested` webhook record — for example because the PR was opened before the plugin was installed, or because the most-recent request preceded a Mattermost restart. Without this, PRs whose pending review was actually a few days old were showing up as "86 days overdue" against the configured target. Backfilled times are written back to KV so subsequent digest runs hit the fast path.

Presentation. The digest groups overdue items into seven fixed buckets (1–3 / 4–7 / 8–14 / 15–30 / 31–90 / 91–365 / >1y), renders PR titles as markdown links (with `]`/`\` escaping for JIRA-prefixed names), and renders each reviewer as their actual Mattermost @-mention via `getGitHubToUsernameMapping`. Reviewers without a Mattermost account render as `(not connected) - <login>` so admins can see who's missing without producing dead @-mentions. The configured SLA target appears in the post header. The RHS panel grew per-item SLA badges (Overdue / Due today / Due in N days) sharing math with the existing sidebar red-button indicator via a new `webapp/src/utils/sla` module.

The final two commits in this branch decompose `collectAllOverdueSLAItems` into small testable helpers and fix two latent bugs uncovered while doing so:

- `orgOpenPRsQuery` was a package-level `var` that the graphql library decoded into. Concurrent callers — the digest scheduler running alongside an LHS fetch on a different goroutine — could clobber each other's `Search.Nodes` mid-decode. Now defined as a type at package scope (so the library can still reflect on its tags) but instantiated locally per call.
- The resolver's "backfill summary" log previously reported `timeline_pages_fetched=0 kv_hits=0` regardless of actual usage. Now wired up via a returned `summarize()` callback the digest invokes via defer; counters log as plain ints (not `Sprintf`'d strings) so structured-log backends keep them numeric; the resolver isn't constructed at all on degenerate runs (no PRs, no orgs configured) so the summary line only appears when the digest actually had work to do.

Ticket Link
https://mattermost.atlassian.net/browse/MM-68539
For sysadmins: SLA backfill summary log
Once per local-day digest run, the plugin emits one Info line:
- `timeline_pages_fetched` — distinct PRs whose GitHub issue timeline the digest had to pull because no cached SLA-start was on file.
- `kv_hits` — `(PR, reviewer)` pairs whose SLA-start time came from KV (either from a live `review_requested` webhook or a previous run's backfill).

Together these tell you whether the SLA backfill is healthy. The line lives at Info on purpose, so admins don't have to enable Debug logging just to spot trouble.
What healthy looks like
The shape to internalize: after a few days, `kv_hits` should dominate; `timeline_pages_fetched` tracks new PR activity since the last run.

What a problem looks like
- `timeline_pages_fetched` stays high day after day, `kv_hits` near 0
- `Failed to store review SLA start time` Warn entries in the same log window
- `review_request_removed`, GitHub schema drift, scope/permission issue on the service user's token
- `SLA digest timeline fetch failed`
- `timeline_pages_fetched` >> count of open PRs in your configured org(s)
- `OrganizationsList` config; report as a bug
- `SLA digest cannot run` warnings (no orgs configured / no service user available / no PRs found)
SLA digest backfill summaryline in the server log.kv_hitsproportional to the number of overdue items in the channel post? If not, the cache isn't being used → days-overdue is falling back to PR-open dates → numbers are inflated.timeline_pages_fetchedplausible vs. the open-PR count? If it's far above, the per-run cache is broken; if it stays high day after day with nokv_hits, the writeback is broken.SLA digest timeline fetch failedDebug entries (need Debug logging enabled to see those) to separate "fetch failing" from "fetch returning nothing."Test plan
Test plan
- `go test ./server/...` passes
- `go vet ./server/...` clean
- `TestNewDigestSLAStartResolver_LogsAccurateCounters` locks in both invariants of the summary-log fix (no fire at construction time, exactly one fire when `summarize()` is called, with correct counters)
- `prRefFromIssue`, `gatherReviewersForPR`, and `pickServiceGitHubUser` (deterministic ordering, broken-token skip, no-connected-users)
- `kv_hits` rises and `timeline_pages_fetched` drops
Reasoning: This PR makes broad, cross-cutting changes to the SLA digest system: it replaces per-user REST searches with an org-wide GraphQL discovery path, introduces timeline-based SLA backfill and KV persistence, and updates server and webapp presentation (including new SLA badges). The work spans multiple core modules and alters data-gathering, persistence, and user-facing behavior, so the blast radius and potential for subtle failures are large despite added unit tests.
Regression Risk: High — the new GraphQL-based discovery, pagination, timeline backfill, and KV caching touch core data paths and shared utilities; incorrect pagination, timeline interpretation, or KV handling could lead to missed or misattributed SLA anchors and UI/display inconsistencies. Although extensive tests were added, there are still integration risks across server↔client boundaries and failure/cancellation paths.
QA Recommendation: Strong manual QA and integration testing are recommended. Validate org-wide PR discovery with mixed direct/team requests, SLA backfill correctness on PRs lacking webhook history, KV persistence/resilience across restarts, correct reviewer rendering for connected vs unconnected users, SLA-badge states in the RHS and sidebar, and failure/cancellation behaviors (ensure the daily marker isn’t advanced on incomplete scans). Automated end-to-end tests covering GraphQL → digest → UI rendering should be added before broad rollout.
Generated by CodeRabbitAI