fix(Segment membership): Zero out (segment, env) pairs that stopped matching#7600
fix(Segment membership): Zero out (segment, env) pairs that stopped matching#7600khvn26 wants to merge 4 commits into
Conversation
…atching `refresh_project_segment_counts` builds the upsert list from `compute_segment_counts_for_project`, which only returns pairs with at least one match. Combined with `bulk_create(update_conflicts=True)` -- which only touches rows it's given -- a pair that previously matched but no longer does keeps the stale non-zero count forever. Saw this in practice on staging: editing a segment rule so that a previously-matching env dropped to zero left the old count visible in the UI through the next refresh cycle. Append explicit zero-count rows for every (segment, env) pair the project already has in the table but didn't appear in this run's result. The next bulk_create then drives those down to zero. Pairs that have never matched stay absent (no row pre-population). Log the zeroed-row count alongside the existing membership_counts__count on `refresh.project.completed` so this path is observable. beep boop
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
There was a problem hiding this comment.
Code Review
This pull request updates the segment membership count refresh task to explicitly zero out stale membership counts for segment-environment pairs that no longer match, preventing stale counts from lingering. It also adds corresponding unit tests and updates the observability documentation. The reviewer suggested a performance optimization to filter the existing pairs by count__gt=0 to avoid redundant database updates for rows that have already been zeroed out.
| existing_pairs = SegmentMembershipCount.objects.filter( | ||
| segment__project=project, | ||
| ).values_list("segment_id", "environment_id") |
There was a problem hiding this comment.
Currently, existing_pairs fetches all existing SegmentMembershipCount records for the project, regardless of their current count. This means that any pair that has already been zeroed out in a previous run will continue to be fetched, included in zeroed_counts, and updated via bulk_create on every subsequent run.
At scale, this results in a large number of redundant database writes (updates) for steady-state zero counts.
By filtering existing_pairs to only include records where count > 0, we ensure that we only zero out pairs that actually transitioned from a positive count to zero, avoiding redundant updates for already-zeroed rows.
existing_pairs = SegmentMembershipCount.objects.filter(
segment__project=project,
count__gt=0,
).values_list("segment_id", "environment_id")There was a problem hiding this comment.
Sorry, probably lacking context, but is there a benefit to adding zero count records instead of doing something like:
SegmentMembershipCount.objects.exclude(...).delete()Maybe we're planning on being able to show a history?
There was a problem hiding this comment.
Switched to delete in 8d259ad — same end state for the UI without keeping zero rows around.
There was a problem hiding this comment.
IDK if we should invest in history given we'll move from polling to CDC fairly soon.
Filter `existing_pairs` by `count__gt=0` so steady-state refreshes don't re-write zeros over zeros. Per gemini-code-assist on #7600. beep boop
Playwright Test Results (oss - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-16)Details
Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (private-cloud - depot-ubuntu-latest-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (private-cloud - depot-ubuntu-latest-16)Details
|
| # `count__gt=0` keeps steady-state refreshes off rows that are already | ||
| # zeroed -- writing zero over zero on every run is just churn. |
There was a problem hiding this comment.
Nit: this comment feels slop-y to me. It could just be:
Ignore records that already have `count=0` to avoid unnecessary writes.
There was a problem hiding this comment.
Replaced with delete in 8d259ad, the slop comment is gone.
| existing_pairs = SegmentMembershipCount.objects.filter( | ||
| segment__project=project, | ||
| ).values_list("segment_id", "environment_id") |
There was a problem hiding this comment.
Sorry, probably lacking context, but is there a benefit to adding zero count records instead of doing something like:
SegmentMembershipCount.objects.exclude(...).delete()Maybe we're planning on being able to show a history?
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7600 +/- ##
=======================================
Coverage 98.51% 98.51%
=======================================
Files 1436 1436
Lines 54363 54405 +42
=======================================
+ Hits 53553 53595 +42
Misses 810 810 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Per matthewelwell on #7600: zero-out rows just for the next refresh to read them isn't useful — we're not surfacing history anywhere — so drop them outright. The frontend already renders no chip when a membership row is absent, which is the same UX the explicit zero was reaching for. beep boop
Docker builds report
|
Visual Regression19 screenshots compared. See report for details. |
| # Drop pairs that stopped matching this run so the next refresh | ||
| # doesn't carry stale counts forward. |
There was a problem hiding this comment.
This feels slop-y again? I think it's referencing old context about 'carrying stale counts forward'
There was a problem hiding this comment.
Dropped the comment in 5da46f1 — the code is self-evident now.
| stale_deleted, _ = ( | ||
| SegmentMembershipCount.objects.filter(id__in=stale_ids).delete() | ||
| if stale_ids | ||
| else (0, {}) | ||
| ) |
There was a problem hiding this comment.
I suspect this would work the same, no? Surely 0, {} is the default response from .delete() and we're already working with a list, not None.
| stale_deleted, _ = ( | |
| SegmentMembershipCount.objects.filter(id__in=stale_ids).delete() | |
| if stale_ids | |
| else (0, {}) | |
| ) | |
| stale_deleted, _ = SegmentMembershipCount.objects.filter(id__in=stale_ids).delete() |
There was a problem hiding this comment.
Right — .delete() on an empty id__in=[] filter already returns (0, {}) without issuing SQL. Adopted the suggestion in 5da46f1.
Per matthewelwell on #7600: comment was carrying old context, and the `if stale_ids else (0, {})` guard is unnecessary since `.delete()` on an empty `id__in` filter already returns `(0, {})` without issuing SQL. beep boop
Thanks for submitting a PR! Please check the boxes below:
docs/if required so people know about the feature.Changes
Contributes to #5663.
refresh_project_segment_countsonly upserts pairs returned by compute (count > 0), so a pair that previously matched but no longer does keeps its stale non-zero count forever. Append explicit zero rows for every(segment, env)pair already in the table that didn't appear in this run; pairs that have never matched stay absent.How did you test this code?
Added
test_refresh_project_segment_counts__previously_matching_pair_drops_to_zero__row_zeroedandtest_refresh_project_segment_counts__never_matched_pair__no_row_written.