Skip to content

ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact)#40

Merged
ypriverol merged 5 commits into
devfrom
feat/id-rate-pxd001819-tmt
May 28, 2026
Merged

ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact)#40
ypriverol merged 5 commits into
devfrom
feat/id-rate-pxd001819-tmt

Conversation

@ypriverol
Copy link
Copy Markdown
Member

Summary

Outcome of the PXD001819 + TMT ID-rate investigation. The headline is a diagnostic conclusion plus one small, zero-risk win:

  1. Phase 1b — calibration fires on PXD001819 (+53 PSMs @1% FDR, 14,755→14,808, flat wall). Decouples the residual cap (200, unchanged for Astral/TMT) from the firing minimum (lowered 200→150) so low-res ion-trap CID datasets — which yield fewer sub-1e-6 SpecEValue PSMs — can calibrate. Astral/TMT behavior is byte-identical (they still use the full best-200 estimate).
  2. Phase 2 — peak-rank assignment is bit-identical to Java. Using a new msgf-trace --dump-peaks diagnostic, Rust and Java assign the identical rank to all 465 matched peaks on the canonical I5 label-flip scan (offset +0, no exceptions). This debunks the I5 doc's central hypothesis (H2 = ~40% of divergence): that RANK_DIFF=301 was an alignment artifact (theoretical-ion m/z cross-matching y/1 vs y/2+offset), not a real rank bug.

Why this matters

Combined with Phase 0 (instrument/tolerance/calibration already correctly resolved on both datasets) and the external review's "BSA 217/217 top-1 parity", the investigation establishes that Rust's scoring is at/near parity with Java. The residual gap (PXD −1.1% after this PR, TMT −5%) is small, sits on low-quality / SpecE-tail spectra, and per the n=9+ project audit regresses Percolator when "fixed" individually. Further ID gain is not reachable on the scoring-parity path — it needs algorithmic changes (tracked separately).

Commits

Commit What
docs(spec) Phased bench-gated design for the investigation
docs(plan) Implementation plan (Phase 0 diagnostic → config → label-flip → additive)
tune(cal) Phase 1b: decouple residual cap from firing minimum → calibration fires on PXD001819 (+53)
diag(trace) msgf-trace --dump-peaks + ScoredSpectrum::dump_active_peaks() (read-only diagnostic)
docs(parity) Phase 2 finding: peak ranks bit-identical to Java; redirects future ID-rate work

Bench (Phase 1b, PXD001819 cal=auto)

PSMs @1% FDR Wall
Baseline (master) 14,755 0:54.36
This PR 14,808 0:54.01

Calibration now fires (learned +0.094 ppm from 193 confident PSMs). Astral/TMT unchanged by construction (cap stays 200; they had ≥200 confident).

Verification

  • cargo clippy --workspace --all-targets -- -D warnings clean
  • cal=off bit-identical PIN gate green (this only changes the cal=auto/on firing decision)
  • calibrator integration tests green
  • Phase 1b benched on the VM (numbers above); peak-rank parity proven via --dump-peaks
  • CI matrix (Lint + 3 OS + CodeRabbit)

Out of scope (next: algorithmic roadmap)

Scoring-parity is exhausted. Real ID gains need algorithmic work (richer rescoring features, candidate-generation changes, chimeric-spectrum handling, predicted-intensity features) — being planned separately, not parity tweaks.

ypriverol added 5 commits May 28, 2026 09:45
Investigation design (like the I5 trace work): close the Rust-vs-Java
label-flip gap on the two datasets where Rust trails upstream Java.

Four levers, sequenced highest-leverage / lowest-risk first, each
bench-gated on all three datasets with a revert path:
 - Phase 0: diagnostic (what instrument/tolerance/calibration each run
   actually resolves to)
 - Phase 1: config levers (instrument/tolerance like the iter20 win;
   calibration engagement)
 - Phase 2: label-flip hot-path fix (H2 peak-rank, then H3 log-prob)
 - Phase 3: additive PIN features (safety net)

Stretch target +10% over current Rust framed as a direction, not a
revert-all gate; ship every net-positive change under the hard gate.
…T ID rate

Task-by-task plan for the design spec: Phase 0 diagnostic (prerequisite),
Phase 1 config levers (instrument/tolerance + calibration, conditional on
Phase 0), Phase 2 label-flip hot-path fix (H2 peak-rank then H3 log-prob,
each with unit test + I5 trace re-run + Percolator bench gate + golden
regen), Phase 3 additive PIN features (safety net), Task 4 close-out PR.

Phase 2 code is intentionally an investigation protocol (pinned file:line,
candidate rules, unit-test shape, bench gate) rather than a pre-written
diff, since the exact tie-break/indexing edit is only knowable after the
trace + Java-source read.
…low-res PXD001819

Phase 0 found precursor calibration was SKIPPED on PXD001819: of ~498
sampled spectra, only 193 yielded sub-1e-6 SpecEValue PSMs — just under
the 200 firing threshold. Root cause: low-res ion-trap CID MS2 yields
fewer high-confidence IDs from the fixed 500-spectrum sample.

The cap on residuals used for the median shift and the minimum required
to fire were the SAME constant (200), so the only ways to fire were to
sample more (doubles prepass wall on ALL datasets — Astral 22s->45s,
fails the speed gate) or lower the shared constant (would shrink the
Astral/TMT estimate from 200 to 150 residuals too).

Decouple them: keep RESIDUAL_CAP = 200 (Astral/TMT estimate unchanged,
still uses the full best-200) and lower the firing minimum
MIN_CONFIDENT_PSMS to 150 (PXD's 193 now clears it). Only PXD's behavior
changes: skip -> fire. Zero extra prepass cost on any dataset.

Local gates green: build, calibrator integration tests (5/5), clippy,
and the cal=off bit-identical PIN gate (this only affects the cal=auto/on
firing decision).
…tion

Phase 2 investigation tooling. Rust assigns peak ranks ~2 higher than
Java for the same spectrum (the label-flip root cause), and the existing
per-ion trace can't show WHICH peaks drive the offset. Add a read-only
`--dump-peaks` mode that prints the post-filter, post-deconvolution active
peak list (rank, m/z, intensity) sorted by rank, plus a focused
`ScoredSpectrum::dump_active_peaks()` accessor. No scoring change.
… to Java

Using msgf-trace --dump-peaks to compare actual observed peaks (not
theoretical-ion m/z), Rust and Java assign the identical rank to all 465
matched peaks on scan 41522 (offset +0, no exceptions).

This debunks the I5 doc's central hypothesis that ~40% of the scoring
divergence is peak-rank assignment (H2). That RANK_DIFF=301 count was an
artifact of aligning by theoretical-ion m/z, which cross-matches different
physical ions (Rust y/1 vs Java y/2+offset at coincidentally-equal m/z).

Consequence: the peak-rank fix this whole investigation was built around
does not exist (Rust already matches Java). Combined with Phase 0 (instrument
/tolerance/calibration already correct) and the reviewer's BSA 217/217 top-1
parity, Rust's scoring is at/near parity with Java. The residual gap
(PXD -1.1% after Phase 1b, TMT -5%) is small, on low-quality/SpecE-tail
spectra, and not closable via scoring-parity fixes without Percolator
regression. +10% over current Rust is not reachable on the scoring path.
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d684fa7e-0a77-4a04-95fb-ed3e6642846d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/id-rate-pxd001819-tmt

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ypriverol ypriverol merged commit ed854fb into dev May 28, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant