ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact) by ypriverol · Pull Request #40 · bigbio/msgf-rust

ypriverol · 2026-05-28T08:46:00Z

Summary

Outcome of the PXD001819 + TMT ID-rate investigation. The headline is a diagnostic conclusion plus one small, zero-risk win:

Phase 1b — calibration fires on PXD001819 (+53 PSMs @1% FDR, 14,755→14,808, flat wall). Decouples the residual cap (200, unchanged for Astral/TMT) from the firing minimum (lowered 200→150) so low-res ion-trap CID datasets — which yield fewer sub-1e-6 SpecEValue PSMs — can calibrate. Astral/TMT behavior is byte-identical (they still use the full best-200 estimate).
Phase 2 — peak-rank assignment is bit-identical to Java. Using a new msgf-trace --dump-peaks diagnostic, Rust and Java assign the identical rank to all 465 matched peaks on the canonical I5 label-flip scan (offset +0, no exceptions). This debunks the I5 doc's central hypothesis (H2 = ~40% of divergence): that RANK_DIFF=301 was an alignment artifact (theoretical-ion m/z cross-matching y/1 vs y/2+offset), not a real rank bug.

Why this matters

Combined with Phase 0 (instrument/tolerance/calibration already correctly resolved on both datasets) and the external review's "BSA 217/217 top-1 parity", the investigation establishes that Rust's scoring is at/near parity with Java. The residual gap (PXD −1.1% after this PR, TMT −5%) is small, sits on low-quality / SpecE-tail spectra, and per the n=9+ project audit regresses Percolator when "fixed" individually. Further ID gain is not reachable on the scoring-parity path — it needs algorithmic changes (tracked separately).

Commits

Commit	What
`docs(spec)`	Phased bench-gated design for the investigation
`docs(plan)`	Implementation plan (Phase 0 diagnostic → config → label-flip → additive)
`tune(cal)`	Phase 1b: decouple residual cap from firing minimum → calibration fires on PXD001819 (+53)
`diag(trace)`	`msgf-trace --dump-peaks` + `ScoredSpectrum::dump_active_peaks()` (read-only diagnostic)
`docs(parity)`	Phase 2 finding: peak ranks bit-identical to Java; redirects future ID-rate work

Bench (Phase 1b, PXD001819 cal=auto)

	PSMs @1% FDR	Wall
Baseline (master)	14,755	0:54.36
This PR	14,808	0:54.01

Calibration now fires (learned +0.094 ppm from 193 confident PSMs). Astral/TMT unchanged by construction (cap stays 200; they had ≥200 confident).

Verification

cargo clippy --workspace --all-targets -- -D warnings clean
cal=off bit-identical PIN gate green (this only changes the cal=auto/on firing decision)
calibrator integration tests green
Phase 1b benched on the VM (numbers above); peak-rank parity proven via --dump-peaks
CI matrix (Lint + 3 OS + CodeRabbit)

Out of scope (next: algorithmic roadmap)

Scoring-parity is exhausted. Real ID gains need algorithmic work (richer rescoring features, candidate-generation changes, chimeric-spectrum handling, predicted-intensity features) — being planned separately, not parity tweaks.

Investigation design (like the I5 trace work): close the Rust-vs-Java label-flip gap on the two datasets where Rust trails upstream Java. Four levers, sequenced highest-leverage / lowest-risk first, each bench-gated on all three datasets with a revert path: - Phase 0: diagnostic (what instrument/tolerance/calibration each run actually resolves to) - Phase 1: config levers (instrument/tolerance like the iter20 win; calibration engagement) - Phase 2: label-flip hot-path fix (H2 peak-rank, then H3 log-prob) - Phase 3: additive PIN features (safety net) Stretch target +10% over current Rust framed as a direction, not a revert-all gate; ship every net-positive change under the hard gate.

…T ID rate Task-by-task plan for the design spec: Phase 0 diagnostic (prerequisite), Phase 1 config levers (instrument/tolerance + calibration, conditional on Phase 0), Phase 2 label-flip hot-path fix (H2 peak-rank then H3 log-prob, each with unit test + I5 trace re-run + Percolator bench gate + golden regen), Phase 3 additive PIN features (safety net), Task 4 close-out PR. Phase 2 code is intentionally an investigation protocol (pinned file:line, candidate rules, unit-test shape, bench gate) rather than a pre-written diff, since the exact tie-break/indexing edit is only knowable after the trace + Java-source read.

…low-res PXD001819 Phase 0 found precursor calibration was SKIPPED on PXD001819: of ~498 sampled spectra, only 193 yielded sub-1e-6 SpecEValue PSMs — just under the 200 firing threshold. Root cause: low-res ion-trap CID MS2 yields fewer high-confidence IDs from the fixed 500-spectrum sample. The cap on residuals used for the median shift and the minimum required to fire were the SAME constant (200), so the only ways to fire were to sample more (doubles prepass wall on ALL datasets — Astral 22s->45s, fails the speed gate) or lower the shared constant (would shrink the Astral/TMT estimate from 200 to 150 residuals too). Decouple them: keep RESIDUAL_CAP = 200 (Astral/TMT estimate unchanged, still uses the full best-200) and lower the firing minimum MIN_CONFIDENT_PSMS to 150 (PXD's 193 now clears it). Only PXD's behavior changes: skip -> fire. Zero extra prepass cost on any dataset. Local gates green: build, calibrator integration tests (5/5), clippy, and the cal=off bit-identical PIN gate (this only affects the cal=auto/on firing decision).

…tion Phase 2 investigation tooling. Rust assigns peak ranks ~2 higher than Java for the same spectrum (the label-flip root cause), and the existing per-ion trace can't show WHICH peaks drive the offset. Add a read-only `--dump-peaks` mode that prints the post-filter, post-deconvolution active peak list (rank, m/z, intensity) sorted by rank, plus a focused `ScoredSpectrum::dump_active_peaks()` accessor. No scoring change.

… to Java Using msgf-trace --dump-peaks to compare actual observed peaks (not theoretical-ion m/z), Rust and Java assign the identical rank to all 465 matched peaks on scan 41522 (offset +0, no exceptions). This debunks the I5 doc's central hypothesis that ~40% of the scoring divergence is peak-rank assignment (H2). That RANK_DIFF=301 count was an artifact of aligning by theoretical-ion m/z, which cross-matches different physical ions (Rust y/1 vs Java y/2+offset at coincidentally-equal m/z). Consequence: the peak-rank fix this whole investigation was built around does not exist (Rust already matches Java). Combined with Phase 0 (instrument /tolerance/calibration already correct) and the reviewer's BSA 217/217 top-1 parity, Rust's scoring is at/near parity with Java. The residual gap (PXD -1.1% after Phase 1b, TMT -5%) is small, on low-quality/SpecE-tail spectra, and not closable via scoring-parity fixes without Percolator regression. +10% over current Rust is not reachable on the scoring path.

qodo-code-review · 2026-05-28T08:46:05Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

coderabbitai · 2026-05-28T08:46:08Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d684fa7e-0a77-4a04-95fb-ed3e6642846d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/id-rate-pxd001819-tmt

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ypriverol added 5 commits May 28, 2026 09:45

ypriverol merged commit ed854fb into dev May 28, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact)#40

ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact)#40
ypriverol merged 5 commits into
devfrom
feat/id-rate-pxd001819-tmt

ypriverol commented May 28, 2026

Uh oh!

qodo-code-review Bot commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Review skipped

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ypriverol commented May 28, 2026

Summary

Why this matters

Commits

Bench (Phase 1b, PXD001819 cal=auto)

Verification

Out of scope (next: algorithmic roadmap)

Uh oh!

qodo-code-review Bot commented May 28, 2026

Qodo reviews are paused for this user.

Uh oh!

coderabbitai Bot commented May 28, 2026

Review skipped

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant