Skip to content

test(e2e): cover heatmap drag-select persistence (HDX-4147)#2209

Open
alex-fedotyev wants to merge 12 commits into
alex/HDX-4147-heatmap-select-persistfrom
alex/HDX-4147-heatmap-select-e2e
Open

test(e2e): cover heatmap drag-select persistence (HDX-4147)#2209
alex-fedotyev wants to merge 12 commits into
alex/HDX-4147-heatmap-select-persistfrom
alex/HDX-4147-heatmap-select-e2e

Conversation

@alex-fedotyev
Copy link
Copy Markdown
Contributor

@alex-fedotyev alex-fedotyev commented May 6, 2026

Summary

Adds Playwright e2e regression coverage for HDX-4147 (the heatmap drag-select rectangle bug fixed in #2189). Stacked on alex/HDX-4147-heatmap-select-persist; rebases on main once that PR merges and the diff drops to just the test commit.

The bug was that the dashed selection rectangle on the Event Deltas heatmap collapsed to a 2x2 px residue after mouseup. There was no automated test exercising the drag-select interaction, which is why a wrong fix was mergeable. This test locks in the fix and catches any future regression on the same surface.

What it tests

One spec at packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts, three test cases:

  1. Drag-select draws a persistent rectangle and writes URL state. Asserts URL gains xMin/xMax/yMin/yMax and .u-select has width AND height > 20 px (the 2x2 px collapse signature is well below this).
  2. Reloading the page restores the rectangle from URL state. Drag, capture rectangle dimensions, reload, assert the rectangle returns with the same dimensions. This is what specifically catches the ready-vs-onCreate timing issue in fix(heatmap): persist drag-select rectangle across re-renders #2189: at onCreate, u.scales.y.min/max aren't populated for mode-2 facet data, so the rectangle restore must run from uPlot's ready hook.
  3. Clicking off the rectangle clears both URL state and the rectangle. URL params drop, .u-select collapses below 5 px on each axis.

All assertions use expect.poll for URL and bounding-box reads. nuqs's setFields flushes the URL update asynchronously and uPlot's ready hook fires after first draw, so one-shot reads race with both.

Page object additions

SearchPage:

  • openEventDeltasMode(): switches to the Event Deltas analysis tab and waits for .uplot.
  • getHeatmap(), getHeatmapSelectionRect(): locators for the chart and its selection element.
  • dragHeatmapSelection(startX, startY, endX, endY): spans both axes so the resulting selection has non-zero width AND height (uPlot treats single-axis drags as zero-size and skips the hook).

Verification that the test catches the bug

I cherry-picked the two test commits onto plain main (which does NOT have #2189's fix) and ran CI on that branch (PR #2217, since closed). All three test cases failed, all retries (#1 and #2) also failed:

  • Test 1 failed at line 41 (the .u-select width poll).
  • Test 2 failed at line 61 (the .u-select width poll before reload).
  • Test 3 failed at line 119 (the click setup, because the persisted rectangle never existed to begin with).

CI run on bug-state code: https://github.com/hyperdxio/hyperdx/actions/runs/25450398876. CI run on fix-state code (this PR): https://github.com/hyperdxio/hyperdx/actions/runs/25449203405. Both runs ran the same spec; the only difference is whether selectionBounds plumbing and the ready-hook restore exist in DBHeatmapChart.tsx.

Test plan

  • yarn workspace @hyperdx/app run tsc --noEmit clean.
  • yarn workspace @hyperdx/app run ci:lint clean.
  • CI e2e shards 1-4 pass on this branch with the fix (ff2157d9).
  • CI e2e shard 3 fails on plain main without the fix; all three test cases fail with retries (verified via PR DO NOT MERGE: prove e2e tests fail without #2189 fix #2217, now closed).

Why this is a separate PR

#2189 is the fix; this is the regression guard. Splitting keeps each at a manageable size. Bundling the test on top of the fix would have pushed #2189's classifier from tier-3 to tier-4. Per notes/lessons/2026-05-06-pr-2189-verify-on-fresh-build.md, perf optimizations and bug fixes shouldn't ride together; tests for a bug also fit cleanly in their own PR.

The dashed selection rectangle on the Event Deltas heatmap collapses to
a 2x2 px box the instant the user releases the mouse. The filter is
applied correctly (URL params, comparison legend, bar charts all
update), but visually you can no longer see what region you picked.

Root cause: uplot-react's dataMatch deep-compares the data prop and
calls chart.setData(data, true) when refs differ. setData resets
uPlot's internal u.select rectangle. HeatmapContainer rebuilt the
[time, bucket, count] arrays in its render body, so on every parent
re-render those entries had new references with identical values, and
dataMatch returned false.

Memoize the data array with stable refs so dataMatch returns true,
setData is skipped, and u.select survives the re-render that follows
a drag (the URL param write triggers).

Also memoize generatedTsBuckets so its fresh-Date-array each render
doesn't propagate into the heatmapData useMemo deps.

HDX-4147
Lint feedback from CI:
- timestampColumn?.name dep is wrong since the closure uses the full
  timestampColumn object; pass the object instead. The ref is stable
  when data is stable, so this doesn't reintroduce the original bug.
- Stop destructuring bucket/count when only time is needed for the
  not-enough-data short-circuit.

Add changeset (patch on @hyperdx/app).
Bot review feedback: timestampColumn was a non-primitive useMemo dep
derived entirely from data?.meta. Move the call inside the useMemo so
data is the only relevant dep, which is the cleanest way to satisfy
both exhaustive-deps and the ref-stability the fix relies on.

Also drop the // x values / // y value series 1 / // y value series 2
labels — they describe what the line is rather than why.
prose-lint flags U+2014 anywhere; tighten the comment instead.
Drop the dataMatch description (it was element-wise === per series,
not top-level === as written) and consolidate to the actionable
mechanism: fresh data ref drives uplot-react to call setData(data, true)
which wipes u.select. Add HDX-4147 ticket reference.
The previous memoization fix was insufficient. The actual cause is that
uplot-react destroys and recreates the chart whenever its `options` prop
ref changes. `tickFormatter` was rebuilt on every render because
`numberFormat` is a fresh object reference from callers (e.g.
DBSearchHeatmapChart constructs `{output: 'duration', factor: 0.001}`
inline). Rebuilt tickFormatter cascaded into the `options` useMemo,
optionsUpdateState detected new top-level keys (series, axes, cursor,
plugins are new array/object literals on each recompute), classified the
change as 'create', and the chart was destroyed and recreated, wiping
`u.select`.

Two layers:

1. Stabilize `tickFormatter`'s dep on `numberFormat` content via
   JSON.stringify, so callers passing a fresh-each-render `numberFormat`
   no longer cause chart recreation.

2. Pass the URL-backed selection into `Heatmap` as a `selectionBounds`
   prop and reapply via `setSelect` on chart create + on the prop
   changing. This makes the URL the source of truth and survives any
   future cause of chart recreation (theme switch, resize bounce, etc.).
Page reload with a URL-encoded selection wasn't drawing the rectangle.
Root cause: at onCreate time, uPlot has constructed but not yet completed
its initial layout for mode-2 facet data. u.scales.y.min/max can still
be undefined, so applySelectionToChart's early-out triggered and setSelect
never got called.

Move the apply call to uPlot's `ready` hook, which fires once after the
initial draw completes. Hold selectionBounds and scaleType in refs so
the hook (captured inside the options useMemo) reads the latest values
without requiring memo dep churn.
Adds Playwright regression coverage for the bug fixed in PR #2189:
the dashed drag-select rectangle on the Event Deltas heatmap collapsed
to a 2x2 px residue after mouseup unless u.select is mirrored from the
URL state.

Three scenarios in one spec:

1. Drag-select draws a persistent rectangle and writes URL state.
   Asserts URL gains xMin/xMax/yMin/yMax AND the .u-select element
   has width/height > 20 px, well above the 2x2 px collapse signature.

2. Reloading the page restores the rectangle from URL state. This is
   what specifically catches the ready-vs-onCreate timing issue: at
   onCreate u.scales.y.min/max aren't populated for mode-2 facet data,
   so the apply must run from uPlot's ready hook. The test reloads
   after a drag and asserts the rectangle returns with the same
   dimensions.

3. Clicking off the rectangle clears both URL state and the
   rectangle (collapses below 5 px on each axis).

New SearchPage methods: openEventDeltasMode, getHeatmap,
getHeatmapSelectionRect, dragHeatmapSelection. The drag helper spans
both axes so the resulting selection has non-zero width AND height
(uPlot treats single-axis drags as zero-size and skips the hook).
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 6, 2026

⚠️ No Changeset found

Latest commit: be7749f

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link
Copy Markdown

vercel Bot commented May 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 12, 2026 5:45pm

Request Review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

PR Review

✅ No critical issues found.

Test-only PR; follows the existing dragHistogramToZoom page-object pattern, uses expect.poll consistently for nuqs/uPlot async flushes, and the PR description documents a CI run on bug-state code that confirmed all three cases fail without #2189. A couple of minor observations (not blockers):

  • ℹ️ event-deltas-heatmap.spec.ts:61-67 is the only expect.poll block missing a message: option → consider adding one for parity with the others when a poll fails.
  • ℹ️ Test 3 clicks at (0.9, 0.05) of the heatmap bounding box — that region overlaps the canvas wrapper's axis/title area depending on layout. If this flakes in CI later, switch to clicking inside the plotting area but outside the selection (e.g. (0.9, 0.15)).
  • ℹ️ getHeatmap() returns .uplot.first() — fine for Event Deltas today, but worth a tighter selector (e.g. parent data-testid) if the page ever renders a second uPlot above the heatmap.

@alex-fedotyev alex-fedotyev changed the base branch from alex/HDX-4147-heatmap-select-persist to main May 6, 2026 16:34
@github-actions github-actions Bot added the review/tier-2 Low risk — AI review + quick human skim label May 6, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

🔵 Tier 2 — Low Risk

Small, isolated change with no API route or data model modifications.

Why this tier:

  • Standard feature/fix — introduces new logic or modifies core functionality

Review process: AI review + quick human skim (target: 5–15 min). Reviewer validates AI assessment and checks for domain-specific concerns.
SLA: Resolve within 4 business hours.

Stats
  • Production files changed: 0
  • Production lines changed: 0 (+ 216 in test files, excluded from tier calculation)
  • Branch: alex/HDX-4147-heatmap-select-e2e
  • Author: alex-fedotyev

To override this classification, remove the review/tier-2 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

E2E Test Results

All tests passed • 162 passed • 3 skipped • 1153s

Status Count
✅ Passed 162
❌ Failed 0
⚠️ Flaky 4
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

nuqs setFields and uPlot's ready hook both flush asynchronously after
mouseup and after page reload. The synchronous reads at line 27 (URL
params) and line 35/64 (post-reload boundingBox) raced with the flush
and failed in CI shard 2 across all 3 retries: the URL never carried
xMin= within the test window.

Wrap each URL read in expect.poll(...) per param and gate the rect
dimension reads on a polled width > 20 (or < 5 for the post-clear
collapse). Capture rect once after polling settles, then assert the
remaining dimensions and the cross-check round-trip.

Also fixes lint: simple-import-sort/imports, prettier multi-line
arg formatting on assertion messages.
@alex-fedotyev alex-fedotyev changed the base branch from main to alex/HDX-4147-heatmap-select-persist May 6, 2026 16:56
alex-fedotyev added a commit that referenced this pull request May 6, 2026
The three failing dashboard-container tests in CI shard 1 came down to
two distinct races:

1. Tests #3 (line 112) and #5 (line 194) read getActiveTabsParam()[id]
   synchronously after Add Tab and after tab switches. nuqs flushes URL
   state asynchronously, so the read fires before the param is written.
   Wrap each sync read in expect.poll, mirroring the fix in PR #2209.

2. Test #6 (line 231) paired toggleGroupBordered(idA) with
   addTabToGroup(idB) back-to-back. Both setDashboard calls produce()
   from the same pre-mutation snapshot of the React Query cache, so
   the second PATCH overwrites the first; the toggle is silently
   dropped. The save-and-reload assertion then sees the wrong state
   (or, when goto fires before the PATCH lands, no state at all).

   Narrow the round-trip to a single mutation per step (addTabToGroup
   on group B), wait for networkidle before navigating away, capture
   the dashboard id from the URL while we are still on the page, and
   use expect.poll for the post-reload getGroupOrder assertion. The
   bordered toggle stays covered by the dedicated in-page test at
   line 78. The underlying back-pressure race is tracked separately
   in #2216.
@alex-fedotyev alex-fedotyev force-pushed the alex/HDX-4147-heatmap-select-persist branch from 287183d to 41b87cc Compare May 8, 2026 00:18
alex-fedotyev added a commit that referenced this pull request May 8, 2026
The three failing dashboard-container tests in CI shard 1 came down to
two distinct races:

1. Tests #3 (line 112) and #5 (line 194) read getActiveTabsParam()[id]
   synchronously after Add Tab and after tab switches. nuqs flushes URL
   state asynchronously, so the read fires before the param is written.
   Wrap each sync read in expect.poll, mirroring the fix in PR #2209.

2. Test #6 (line 231) paired toggleGroupBordered(idA) with
   addTabToGroup(idB) back-to-back. Both setDashboard calls produce()
   from the same pre-mutation snapshot of the React Query cache, so
   the second PATCH overwrites the first; the toggle is silently
   dropped. The save-and-reload assertion then sees the wrong state
   (or, when goto fires before the PATCH lands, no state at all).

   Narrow the round-trip to a single mutation per step (addTabToGroup
   on group B), wait for networkidle before navigating away, capture
   the dashboard id from the URL while we are still on the page, and
   use expect.poll for the post-reload getGroupOrder assertion. The
   bordered toggle stays covered by the dedicated in-page test at
   line 78. The underlying back-pressure race is tracked separately
   in #2216.
…sist' into alex/HDX-4147-heatmap-select-e2e

# Conflicts:
#	packages/app/src/components/DBHeatmapChart.tsx
@github-actions
Copy link
Copy Markdown
Contributor

Deep Review

This PR adds three Playwright e2e tests plus page-object helpers for the heatmap drag-select interaction. No production code; review is scoped to test correctness, flake resistance, and assertion strength.

✅ No critical issues found.

🟡 P2 -- recommended

  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:115 -- click-to-clear coordinates (width*0.9, height*0.05) rely on the dragHeatmapSelection defaults staying inside (0.25-0.7, 0.3-0.7), and y=5% likely lands on uPlot's axis/title row rather than the plot area .u-over.
    • Fix: Compute the clear-click target from the same percentages used for the drag and scope it to .u-over to keep the test inside the plot rectangle.
    • testing, maintainability, correctness
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:119 -- between drag mouseup and the off-rectangle click the spec only polls the URL, which can resolve faster than 300ms and let the click hit DBHeatmapChart's justDraggedAtRef guard so the clear is silently dropped.
    • Fix: Poll getHeatmapSelectionRect().boundingBox().width > 20 after the drag and before the clearing click so the post-mouseup guard window has elapsed.
    • julik-frontend-races, correctness
  • packages/app/tests/e2e/page-objects/SearchPage.ts:318 -- getHeatmap() uses page.locator('.uplot').first(), which after page.reload() or once a sibling chart lands will silently bind to whichever uPlot mounts first instead of the Event Deltas heatmap.
    • Fix: Scope the locator to page.getByRole('tabpanel', { name: 'Event Deltas' }).locator('.uplot') or add a data-testid on the deltas heatmap container and target it.
    • julik-frontend-races, testing
  • packages/app/tests/e2e/page-objects/SearchPage.ts:354 -- dragHeatmapSelection issues mouse.down/up after only waitFor({ state: 'visible' }), which proves the wrapper exists but not that uPlot's ctor inside the React useEffect has wired pointer handlers, so the drag can silently no-op on slow CI scheduling.
    • Fix: Before issuing mouse events, await expect(getHeatmapSelectionRect()).toBeAttached() and poll the canvas bounding box for non-zero width/height.
    • julik-frontend-races
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:29 -- assertions only check toContain('xMin=') etc., so a regression that writes xMin= with an empty or NaN value still passes.
    • Fix: Parse the URL with new URL(page.url()).searchParams and assert each axis bound is a finite number.
    • correctness, kieran-typescript
🔵 P3 nitpicks (8)
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:88 -- toBeCloseTo(_, 0) only tolerates ±0.5 px on round-trip, which is tighter than the inline "small rounding from log-space conversion is acceptable" comment implies.
    • Fix: Relax to toBeCloseTo(_, -1) (±5 px) or use an explicit absolute tolerance.
  • packages/app/tests/e2e/page-objects/SearchPage.ts:327 -- the poll predicate calls locator.boundingBox() on .u-select, which has its own attach-timeout and can throw on the first iteration rather than returning the ?? 0 sentinel.
    • Fix: Wrap the call in .catch(() => null) so the optional chaining/fallback works regardless of attach state.
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:87 -- after reload the spec polls width but reads height as a one-shot boundingBox, so a y-scale that settles a frame later than x-scale flakes only the height assertion.
    • Fix: Poll both dimensions (e.g. Math.min(width, height) > 20) or split into two expect.poll calls.
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:16 -- the suite uses @search but other specs that depend on real traces data (total-count-matches-table.spec.ts, shared-filters.spec.ts) also carry @full-stack, and this spec needs traces data to render the heatmap.
    • Fix: Tag with ['@search', '@full-stack'] after confirming the local-mode CI lane can serve trace data for the deltas tab.
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:36 -- the > 20 "healthy" and < 5 "collapsed" thresholds are repeated across six assertions without named constants, making future tuning a multi-site edit.
    • Fix: Extract HEALTHY_SELECTION_MIN_PX = 20 and COLLAPSED_SELECTION_MAX_PX = 5 at the top of the spec.
  • packages/app/tests/e2e/page-objects/SearchPage.ts:303 -- openEventDeltasMode diverges from sibling mode-switch helpers switchToSQLMode / switchToLuceneMode.
    • Fix: Rename to switchToEventDeltasMode (or openEventDeltasTab if it is conceptually a tab, not a query mode).
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:45 -- expect(rect).not.toBeNull() does not type-narrow, so the rect!.height non-null assertions rely on convention.
    • Fix: Extract a small readSelectionBox helper that asserts non-null and returns a narrowed object, then drop the ! operators at lines 45, 88, and 146.
  • packages/app/tests/e2e/features/search/event-deltas-heatmap.spec.ts:78 -- after reload, getHeatmap().waitFor({ visible }) only proves the canvas is on screen, not that uPlot's ready hook has applied URL-derived u.select, so a wrong-sized-but-non-zero residue could pass before the eventual correct paint lands.
    • Fix: Poll for two consecutive identical boundingBox reads (stability check) before asserting the restored dimensions.

Reviewers (8): correctness, testing, maintainability, project-standards, agent-native, learnings-researcher, kieran-typescript, julik-frontend-races.

Testing gaps:

  • No assertion on rectangle left/top after reload — a regression that restores correct size at the wrong origin would pass.
  • Test 3 only checks xMin= disappears from the URL; xMax, yMin, yMax are not verified.
  • No coverage for zero-motion drag, drag that ends outside the canvas, or cold navigation to a URL with malformed xMin/xMax/yMin/yMax values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review/tier-2 Low risk — AI review + quick human skim

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant