Skip to content

feat(power): measured-power multinode support (workers[] + per-stage joules)#405

Merged
arygupt merged 4 commits into
masterfrom
feat/measured-power-multinode-app
May 28, 2026
Merged

feat(power): measured-power multinode support (workers[] + per-stage joules)#405
arygupt merged 4 commits into
masterfrom
feat/measured-power-multinode-app

Conversation

@arygupt
Copy link
Copy Markdown
Collaborator

@arygupt arygupt commented May 28, 2026

What

App-side support for the InferenceX runner's multinode measured-power telemetry — the dashboard counterpart to SemiAnalysisAI/InferenceX#1574.

Surfaces, end-to-end (ingest → DB → transform → types):

  • Per-worker powerworkers[], each labeled prefill / decode with num_gpus, hosts[], and per-worker avg_power_w (+ optional temp/util/mem).
  • Per-stage joules (disagg)joules_per_input_token = prefill energy ÷ input; joules_per_output_token = decode energy ÷ output (per-stage, decode GPUs only). joules_per_total_token stays cluster-wide.
  • Cluster-wide telemetryavg_temp_c, peak_temp_c, avg_util_pct, avg_mem_used_mb.

Changes

  • db: migration 006_benchmark_results_workers.sql; ETL mapper + ingest + queries + json-provider carry workers[] and the new metric scalars.
  • constants: metric-keys lists the measured-power fields.
  • app: benchmark-transform + inference types surface the fields; unofficial-run route passes them through; cypress measured-power-overlay e2e.

Schema note (pairs with the runner change)

joules_per_output_token is now the per-stage decode value on disagg (symmetric with joules_per_input_token), matching the reviewer's spec on #1574. The old joules_per_output_token_decode key is removed (folded into joules_per_output_token) — it was only plumbed through, never rendered; the J/output chart already reads joules_per_output_token.

Validation

  • app 60 + db 223 unit tests pass
  • tsc --noEmit clean; lint (oxlint) 0 warnings/errors

Depends on

Runner PR SemiAnalysisAI/InferenceX#1574 (emits these agg-JSON fields). Land/coordinate together.

🤖 Generated with Claude Code


Note

Medium Risk
Adds a schema migration and materialized-view rebuild plus broad ETL/API contract changes; behavior is well-tested but must stay aligned with the companion runner PR and production migration order.

Overview
Adds end-to-end support for multinode / disaggregated measured-power telemetry from the runner: per-worker breakdowns, disagg role-split scalars, and cluster-wide temp/util/memory metrics.

Database & ingest: Migration 006 adds a dedicated benchmark_results.workers JSONB column (kept out of flat metrics), recreates latest_benchmarks, and extends bulk ingest upserts. ETL gains extractWorkers() with validation, excludes workers from numeric metrics, and registers new scalar keys in METRIC_KEYS.

Read path: Benchmark queries, json-provider dumps, API BenchmarkRow, unofficial-run normalization, and benchmark-transform all pass through workers[] plus scalars like prefill_avg_power_w, decode_avg_power_w, joules_per_input_token, and avg_temp_c—preserving undefined vs zero for legacy rows.

App types & tests: WorkerPower / extended AggDataEntry types document disagg semantics (joules_per_output_token as per-stage decode on disagg). Unit tests cover mapper, transform, and unofficial-run routes; Cypress verifies measured-energy Y-axis options on an unofficial-run overlay.

Reviewed by Cursor Bugbot for commit db9b15d. Bugbot is set up for automated code reviews on this repo. Configure here.

arygupt and others added 2 commits May 28, 2026 10:46
…orm layer

The multinode runner emits per-role power scalars (prefill_avg_power_w,
decode_avg_power_w, joules_per_input_token, joules_per_output_token_decode)
and a per-worker breakdown array on disagg runs. Surface them on
BenchmarkRow -> AggDataEntry -> InferenceData so chart features can consume
them without further transform changes. Workers ride as a sibling of the
scalar metrics dict to keep the existing Record<string, number> index intact.
App-side support for the InferenceX runner's multinode measured-power
telemetry: per-worker power breakdown (workers[] with role / num_gpus /
hosts), per-stage joules (input = prefill energy, output = decode energy),
and cluster-wide temp / util / mem.

- db: migration 006_benchmark_results_workers.sql; ETL mapper + ingest +
  queries + json-provider carry workers[] and the new metric scalars
- constants: metric-keys lists the measured-power fields; joules_per_output_token
  is per-stage decode on disagg (no separate _decode key)
- app: benchmark-transform + inference types surface the fields; unofficial-run
  route passes them through; cypress measured-power-overlay e2e
- drop joules_per_output_token_decode (folded into joules_per_output_token) to
  match the runner's per-stage change

Validated: app 60 + db 223 unit tests pass; tsc --noEmit clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@arygupt arygupt requested a review from adibarra as a code owner May 28, 2026 22:33
@vercel
Copy link
Copy Markdown

vercel Bot commented May 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
inferencemax-app Ready Ready Preview, Comment May 28, 2026 11:27pm

Request Review

@claude
Copy link
Copy Markdown

claude Bot commented May 28, 2026

Claude finished @arygupt's task in 2m 58s —— View job


Claude Code Review

  • Read changed files
  • Analyze multinode transform / type changes
  • Check tests cover the new transform logic
  • Post review feedback

🟢 LGTM — no blocking issues found.

Reviewed the multinode measured-power plumbing end-to-end. The diff is clean and defensive at every JSON boundary:

  • extractWorkers (packages/db/src/etl/benchmark-mapper.ts:263) correctly distinguishes "missing field" from "field == 0" via parseNum/parseInt2 returning undefined only for unparseable values — the frontend worker with num_gpus: 0, avg_power_w: 0 case is exercised by the tests.
  • rowToAggDataEntry (packages/app/src/lib/benchmark-transform.ts:74) narrows row.workers with Array.isArray so a null JSONB column or a malformed payload survives without poisoning downstream consumers (verified by the 'oops' as any test).
  • bulkIngestBenchmarkRows encodes missing workers as JSON null so the heterogeneous jsonb[] unnest stores SQL NULL — correct for the new nullable column.
  • Migration 006_benchmark_results_workers.sql correctly drops + recreates the latest_benchmarks materialized view so br.* picks up the new column, and re-creates both indexes (incl. the (config_id, conc, isl, osl) unique index needed for REFRESH … CONCURRENTLY).
  • The unofficial-run path (route.ts:60) passes workers through symmetrically, and cypress/e2e/measured-power-overlay.cy.ts exercises the overlay code path against a real GHA run — satisfying CLAUDE.md's overlay-support requirement.

Test coverage is solid: dedicated unit tests for extractWorkers edge cases (non-array, empty, missing required fields, string-numeric coercion, malformed hosts), workers propagation through mapBenchmarkRow/rowToAggDataEntry/normalizeArtifactRows, plus the new role-split scalars and cluster-wide temp/util/mem fields.

Minor observation (non-blocking): the WorkerPower shape is now defined three times — packages/db/src/etl/benchmark-mapper.ts:80, packages/db/src/queries/benchmarks.ts:8 (as BenchmarkWorkerRow), and packages/app/src/components/inference/types.ts:35. They're structurally identical so TypeScript accepts cross-boundary assignment, but a future schema change will need to be made in three places. Worth consolidating into @semianalysisai/inferencex-constants at some point.
· feat/measured-power-multinode-app

@blacksmith-sh

This comment has been minimized.

…serting visibility

The measured-power-overlay e2e asserted `.should('be.visible')` on the gated
"Measured Energy" group + its options right after opening the Y-axis dropdown.
That list is a scroll container (searchable-select: max-h-72 overflow-y-auto)
and the group sits below the fold, so the items are in the DOM but clipped by
the overflow ancestor. Cypress `be.visible` does not auto-scroll, so it failed
deterministically on both the chrome and firefox e2e shard-2 jobs. The sibling
test passed only because `.click()` auto-scrolls its target into view.

Add `.scrollIntoView()` before each visibility assertion.

Verified locally: measured-power-overlay.cy.ts 2/2 passing against the real
run 26312107787 artifact with E2E_FIXTURES=1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 28, 2026

Claude finished @arygupt's task in 50s —— View job


Claude Code Review (Re-review)

  • Identify what changed since last review
  • Check the new commit
  • Verify test fix correctness
  • Post re-review summary

🟢 Re-review: LGTM — no blocking issues found.

Only one new commit since the previous review (9847aef): a focused Cypress fix that adds .scrollIntoView() before each should('be.visible') assertion on the gated "Measured Energy" dropdown group in packages/app/cypress/e2e/measured-power-overlay.cy.ts.

Root-cause analysis in the commit message matches the code:

  • The searchable-select list is a max-h-72 overflow-y-auto scroll container, so the new "Measured Energy" group renders below the fold.
  • Cypress be.visible does not auto-scroll, which is why the assertion failed deterministically on both chrome and firefox e2e shard-2 jobs reported by Blacksmith.
  • The second test passed because .click() does auto-scroll its target — consistent with the diagnosis.
  • The fix is the smallest possible: .scrollIntoView() chained before each .should('be.visible'). No production code touched.

Previous review feedback on the multinode telemetry plumbing (extractWorkers parsing, JSONB null handling, materialized-view recreation, overlay path symmetry) all still applies and remains addressed. The non-blocking observation about consolidating the WorkerPower shape into @semianalysisai/inferencex-constants is unchanged — still optional, not introduced by this commit.
· feat/measured-power-multinode-app

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 9847aef. Configure here.

Comment thread packages/db/src/etl/benchmark-mapper.ts
…age dup

The read-side BenchmarkWorkerRow interface in queries/benchmarks.ts was a
verbatim redeclaration of the ingest-side WorkerPower in etl/benchmark-mapper.ts.
Alias it to the canonical definition so the per-worker shape can't drift within
the package, keeping the BenchmarkWorkerRow name its consumers reference.

Addresses the Cursor Bugbot "triplicate WorkerPower interface" finding on #405.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 28, 2026

Claude finished @arygupt's task in 52s —— View job


Claude Code Review (Re-review)

  • Identify what changed since last review
  • Verify refactor correctness
  • Post re-review summary

🟢 Re-review: LGTM — no blocking issues found.

One new commit since the previous review (db9b15d): a focused refactor that aliases BenchmarkWorkerRow to WorkerPower in packages/db/src/queries/benchmarks.ts:11, addressing the Cursor Bugbot "triplicate WorkerPower interface" finding (and the same observation in my prior review).

Verified:

  • The two in-package shapes were field-for-field identical (role, worker_idx, hosts?, num_gpus, avg_power_w, avg_temp_c?, peak_temp_c?, avg_util_pct?, avg_mem_used_mb?) — type BenchmarkWorkerRow = WorkerPower is a safe drop-in.
  • The public name BenchmarkWorkerRow is preserved, so external consumers (json-provider, BenchmarkRow.workers) are unaffected.
  • Scope is correctly limited to the in-db-package duplication. The third copy in packages/app/src/components/inference/types.ts stays — consolidating that would require moving the shape into @semianalysisai/inferencex-constants, which is a larger cross-package change appropriate for a follow-up.

All prior review findings on the multinode telemetry plumbing (extractWorkers parsing, JSONB null handling, materialized-view recreation, overlay path symmetry, test coverage) remain addressed and still apply.
· feat/measured-power-multinode-app

@arygupt arygupt merged commit 183f9b1 into master May 28, 2026
18 checks passed
@arygupt arygupt deleted the feat/measured-power-multinode-app branch May 28, 2026 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant