Observed on PR #1701 (release v3.15.0)
The Pre-publish benchmark gate failed on the 'KNOWN_REGRESSIONS entries are not stale' test (tests/benchmarks/regression-guard.test.ts:691): 11 entries keyed to 3.12.0/3.13.0 were flagged as >1 minor version behind the bumped package.json version (3.15.0).
Root cause
The committed benchmark baseline data (generated/benchmarks/*.md) was stuck at 3.13.0: there was no 'docs: update performance benchmarks (3.14.0)' PR, so the recorded data ends at 3.13.0 even though a v3.14.0 tag exists. Benchmark recording happens post-publish via benchmark.yml (triggered on the Publish workflow's workflow_run), so:
- The release PR bumps package.json to the new version immediately.
- The staleness guard measures entry age against package.json, not against the latest recorded baseline.
- When the package version is ≥2 minors ahead of the latest recorded baseline, every still-live exemption keyed to the previous baseline is flagged stale — even though it's still the active baseline for the dev-vs-baseline comparison.
This made the failure look like a routine 'prune stale entries' chore, but naively deleting the entries before the new baseline lands would have unmasked the dev-vs-old-baseline comparisons instead.
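The mismatch described above can be reproduced with a small model of the check. This is only a sketch: the function names are hypothetical and not taken from regression-guard.test.ts, and the "more than 1 minor behind" rule is inferred from the failure message rather than from the guard's source.

```typescript
// Hypothetical model of the current staleness check. Names are illustrative,
// not copied from regression-guard.test.ts.
type Version = { major: number; minor: number; patch: number };

function parseVersion(v: string): Version {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error(`Not a plain x.y.z version: ${v}`);
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

// How many minor versions the entry lags behind the reference version.
// Assumption: both versions share the same major.
function minorsBehind(entryVersion: string, referenceVersion: string): number {
  const entry = parseVersion(entryVersion);
  const ref = parseVersion(referenceVersion);
  if (entry.major !== ref.major) {
    throw new Error("Cross-major comparison is not modelled in this sketch");
  }
  return ref.minor - entry.minor;
}

// Assumed rule, inferred from the failure message: stale when >1 minor behind.
function isStale(entryVersion: string, referenceVersion: string): boolean {
  return minorsBehind(entryVersion, referenceVersion) > 1;
}

const packageJsonVersion = "3.15.0"; // already bumped by the release PR
const entryVersions = ["3.12.0", "3.13.0"]; // versions the exemptions are keyed to

// Measured against package.json, both groups of entries are flagged.
const flagged = entryVersions.filter((v) => isStale(v, packageJsonVersion));
console.log(flagged); // [ '3.12.0', '3.13.0' ]
```

Measured against 3.13.0 (the latest baseline actually recorded at the time), the same entries are 1 and 0 minors behind, so under this rule neither would be flagged.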
Why 3.14.0's recording is missing
Worth confirming whether 3.14.0 was published through publish.yml (which should have triggered benchmark.yml) and, if so, why no chore/bench-v3.14.0-* PR was created/merged.
Suggested fixes (pick one)
- Measure staleness against the latest recorded benchmark version, not package.json; the guard should only consider an entry stale once a newer baseline that supersedes it has actually landed.
- Allow a grace window (e.g. entries within MAX_VERSION_GAP minors of the latest recorded baseline are not stale), so a release PR doesn't fail before its post-publish benchmark PR merges.
- Ensure the post-publish benchmark-recording PR is created/merged for every release (and investigate the missing 3.14.0 one) so the baseline never lags the package version by >1 minor.
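A minimal sketch of how the first two options could combine, assuming the guard can obtain the list of versions that have recorded benchmark data. The helper names, the value of MAX_VERSION_GAP, and the cross-major handling are all assumptions made for illustration, not the repository's actual implementation.

```typescript
// Hypothetical sketch of the proposed rule. Only the name MAX_VERSION_GAP comes
// from the issue text; its value and every helper below are assumptions.
const MAX_VERSION_GAP = 1;

function toParts(version: string): [number, number, number] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!m) throw new Error(`Not a plain x.y.z version: ${version}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Numeric comparison, so that 3.13.0 sorts after 3.9.0.
function compareVersions(a: string, b: string): number {
  const [aMajor, aMinor, aPatch] = toParts(a);
  const [bMajor, bMinor, bPatch] = toParts(b);
  return aMajor - bMajor || aMinor - bMinor || aPatch - bPatch;
}

// Newest version for which benchmark data has actually been committed.
function latestRecordedVersion(recorded: string[]): string {
  if (recorded.length === 0) throw new Error("No recorded baselines");
  return recorded.reduce((latest, v) => (compareVersions(v, latest) > 0 ? v : latest));
}

// An entry is stale only once a recorded baseline has moved more than
// MAX_VERSION_GAP minors past it. package.json is not consulted at all.
function isStaleAgainstBaseline(entryVersion: string, recorded: string[]): boolean {
  const [entryMajor, entryMinor] = toParts(entryVersion);
  const [baseMajor, baseMinor] = toParts(latestRecordedVersion(recorded));
  // Assumption: entries from an older major are always treated as stale.
  if (entryMajor !== baseMajor) return entryMajor < baseMajor;
  return baseMinor - entryMinor > MAX_VERSION_GAP;
}

// Before the 3.15.0 data lands, the latest recorded baseline is 3.13.0.
const beforeRecording = ["3.12.0", "3.13.0"];
console.log(isStaleAgainstBaseline("3.12.0", beforeRecording)); // false
console.log(isStaleAgainstBaseline("3.13.0", beforeRecording)); // false

// Once the 3.15.0 data is merged, the older entries become prunable.
const afterRecording = [...beforeRecording, "3.15.0"];
console.log(isStaleAgainstBaseline("3.13.0", afterRecording)); // true
```

With a rule like this, the release PR would stay green until its benchmark data lands. After that the guard would prompt the same pruning that was done manually for the immediate unblock.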
Immediate unblock (already applied on #1701)
Merged the 3.15.0 benchmark data (PR #1702) into the release branch, then pruned the now-superseded 3.12.0/3.13.0 entries (commit 65f9ce7). Tracking the systemic fix here.