ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact) #40
Investigation design (like the I5 trace work): close the Rust-vs-Java label-flip gap on the two datasets where Rust trails upstream Java. Four levers, sequenced highest-leverage / lowest-risk first, each bench-gated on all three datasets with a revert path:
- Phase 0: diagnostic (what instrument/tolerance/calibration each run actually resolves to)
- Phase 1: config levers (instrument/tolerance like the iter20 win; calibration engagement)
- Phase 2: label-flip hot-path fix (H2 peak-rank, then H3 log-prob)
- Phase 3: additive PIN features (safety net)
Stretch target: +10% over current Rust, framed as a direction, not a revert-all gate; ship every net-positive change under the hard gate.
…T ID rate

Task-by-task plan for the design spec: Phase 0 diagnostic (prerequisite), Phase 1 config levers (instrument/tolerance + calibration, conditional on Phase 0), Phase 2 label-flip hot-path fix (H2 peak-rank then H3 log-prob, each with unit test + I5 trace re-run + Percolator bench gate + golden regen), Phase 3 additive PIN features (safety net), Task 4 close-out PR. Phase 2 code is intentionally an investigation protocol (pinned file:line, candidate rules, unit-test shape, bench gate) rather than a pre-written diff, since the exact tie-break/indexing edit is only knowable after the trace + Java-source read.
…low-res PXD001819

Phase 0 found precursor calibration was SKIPPED on PXD001819: of ~498 sampled spectra, only 193 yielded sub-1e-6 SpecEValue PSMs, just under the 200 firing threshold. Root cause: low-res ion-trap CID MS2 yields fewer high-confidence IDs from the fixed 500-spectrum sample.

The cap on residuals used for the median shift and the minimum required to fire were the SAME constant (200), so the only ways to fire were to sample more (doubles prepass wall on ALL datasets: Astral 22s->45s, fails the speed gate) or lower the shared constant (would shrink the Astral/TMT estimate from 200 to 150 residuals too).

Decouple them: keep RESIDUAL_CAP = 200 (Astral/TMT estimate unchanged, still uses the full best-200) and lower the firing minimum MIN_CONFIDENT_PSMS to 150 (PXD's 193 now clears it). Only PXD's behavior changes: skip -> fire. Zero extra prepass cost on any dataset.

Local gates green: build, calibrator integration tests (5/5), clippy, and the cal=off bit-identical PIN gate (this only affects the cal=auto/on firing decision).
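The decoupling described in this commit can be sketched as follows. This is a minimal illustration, not the project's calibrator: the constant names and values (`RESIDUAL_CAP = 200`, `MIN_CONFIDENT_PSMS = 150`, the sub-1e-6 SpecEValue cutoff) come from the commit message, while `SampledPsm`, its fields, `SPEC_E_VALUE_CUTOFF`, and `learn_ppm_shift` are hypothetical names, and "best" is assumed to mean lowest SpecEValue.

```rust
/// Cap on residuals used for the median shift (value from the commit message).
const RESIDUAL_CAP: usize = 200;
/// Minimum confident PSMs required for calibration to fire (lowered 200 -> 150).
const MIN_CONFIDENT_PSMS: usize = 150;
/// Hypothetical name for the "sub-1e-6 SpecEValue" confidence cutoff.
const SPEC_E_VALUE_CUTOFF: f64 = 1e-6;

/// Hypothetical PSM record carrying only the fields this sketch needs.
#[derive(Clone, Copy, Debug)]
pub struct SampledPsm {
    pub spec_e_value: f64,
    pub precursor_error_ppm: f64,
}

/// Returns `Some(median ppm shift)` when calibration fires, `None` when it is skipped.
pub fn learn_ppm_shift(psms: &[SampledPsm]) -> Option<f64> {
    // Keep only confident PSMs.
    let mut confident: Vec<SampledPsm> = psms
        .iter()
        .copied()
        .filter(|p| p.spec_e_value < SPEC_E_VALUE_CUTOFF)
        .collect();

    // Firing decision uses the (lower) minimum, independent of the cap.
    if confident.len() < MIN_CONFIDENT_PSMS {
        return None;
    }

    // Assumption: "best" means lowest SpecEValue; keep at most RESIDUAL_CAP of them.
    confident.sort_by(|a, b| a.spec_e_value.total_cmp(&b.spec_e_value));
    confident.truncate(RESIDUAL_CAP);

    // Median of the retained precursor residuals.
    let mut residuals: Vec<f64> = confident.iter().map(|p| p.precursor_error_ppm).collect();
    residuals.sort_by(|a, b| a.total_cmp(b));
    let n = residuals.len();
    let median = if n % 2 == 1 {
        residuals[n / 2]
    } else {
        (residuals[n / 2 - 1] + residuals[n / 2]) / 2.0
    };
    Some(median)
}
```

Under these assumptions, a run with 193 confident PSMs fires and uses all 193 residuals, while a run with 200 or more confident PSMs still estimates from exactly 200, which is why only the PXD behavior changes.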
…tion

Phase 2 investigation tooling. Rust assigns peak ranks ~2 higher than Java for the same spectrum (the label-flip root cause), and the existing per-ion trace can't show WHICH peaks drive the offset. Add a read-only `--dump-peaks` mode that prints the post-filter, post-deconvolution active peak list (rank, m/z, intensity) sorted by rank, plus a focused `ScoredSpectrum::dump_active_peaks()` accessor. No scoring change.
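For illustration only, a peak dump of this shape could look like the sketch below. It is not the code behind `--dump-peaks` or `ScoredSpectrum::dump_active_peaks()`: `RankedPeak`, `rank_peaks`, and `dump_lines` are hypothetical names, the ranking rule (rank 1 = most intense, ties broken by lower m/z) is an assumption, and the tab-separated output format is invented for the example.

```rust
/// Hypothetical record for one active peak in the dump.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct RankedPeak {
    pub rank: usize,
    pub mz: f64,
    pub intensity: f64,
}

/// Assigns ranks to `(mz, intensity)` pairs.
/// Assumption: rank 1 is the most intense peak; ties go to the lower m/z.
pub fn rank_peaks(peaks: &[(f64, f64)]) -> Vec<RankedPeak> {
    let mut ordered: Vec<(f64, f64)> = peaks.to_vec();
    ordered.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.total_cmp(&b.0)));
    ordered
        .iter()
        .enumerate()
        .map(|(i, &(mz, intensity))| RankedPeak { rank: i + 1, mz, intensity })
        .collect()
}

/// Formats peaks as "rank<TAB>m/z<TAB>intensity" lines, sorted by rank.
pub fn dump_lines(peaks: &[RankedPeak]) -> Vec<String> {
    let mut sorted = peaks.to_vec();
    sorted.sort_by_key(|p| p.rank);
    sorted
        .iter()
        .map(|p| format!("{}\t{:.4}\t{:.1}", p.rank, p.mz, p.intensity))
        .collect()
}
```

Keeping the dump a pure function of the already-filtered peak list mirrors the read-only intent stated above: nothing in it can alter scoring.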
… to Java

Using msgf-trace --dump-peaks to compare actual observed peaks (not theoretical-ion m/z), Rust and Java assign the identical rank to all 465 matched peaks on scan 41522 (offset +0, no exceptions). This debunks the I5 doc's central hypothesis that ~40% of the scoring divergence is peak-rank assignment (H2). That RANK_DIFF=301 count was an artifact of aligning by theoretical-ion m/z, which cross-matches different physical ions (Rust y/1 vs Java y/2+offset at coincidentally-equal m/z).

Consequence: the peak-rank fix this whole investigation was built around does not exist (Rust already matches Java). Combined with Phase 0 (instrument/tolerance/calibration already correct) and the reviewer's BSA 217/217 top-1 parity, Rust's scoring is at/near parity with Java. The residual gap (PXD -1.1% after Phase 1b, TMT -5%) is small, on low-quality/SpecE-tail spectra, and not closable via scoring-parity fixes without Percolator regression. +10% over current Rust is not reachable on the scoring path.
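The difference between the two alignment strategies comes down to what the join key is. A minimal sketch of the observed-peak comparison follows, assuming each engine's dump has been parsed into (rank, observed m/z) records; `DumpedPeak`, `rank_offsets`, and the `tol_mz` matching tolerance are hypothetical and not part of `msgf-trace`.

```rust
/// Hypothetical parsed row from a peak dump: rank plus observed m/z.
#[derive(Clone, Copy, Debug)]
pub struct DumpedPeak {
    pub rank: i64,
    pub mz: f64,
}

/// For every Rust peak, finds the nearest Java peak by OBSERVED m/z and,
/// if it lies within `tol_mz`, records the rank offset (Rust minus Java).
/// Peaks with no partner inside the tolerance are left out.
pub fn rank_offsets(rust: &[DumpedPeak], java: &[DumpedPeak], tol_mz: f64) -> Vec<i64> {
    let mut offsets = Vec::new();
    for r in rust {
        // Linear scan is fine for a few hundred peaks per spectrum.
        let nearest = java
            .iter()
            .min_by(|a, b| (a.mz - r.mz).abs().total_cmp(&(b.mz - r.mz).abs()));
        if let Some(j) = nearest {
            if (j.mz - r.mz).abs() <= tol_mz {
                offsets.push(r.rank - j.rank);
            }
        }
    }
    offsets
}
```

Because the key is the physical peak rather than a theoretical ion, two different ions that happen to share an m/z cannot be cross-matched; exact parity then corresponds to every returned offset being 0.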
Summary
Outcome of the PXD001819 + TMT ID-rate investigation. The headline is a diagnostic conclusion plus one small, zero-risk win:
- Calibration nudge on PXD001819 (+53 PSMs @1% FDR, 14,755→14,808, flat wall). Decouples the residual cap (200, unchanged for Astral/TMT) from the firing minimum (lowered 200→150) so low-res ion-trap CID datasets, which yield fewer sub-1e-6 SpecEValue PSMs, can calibrate. Astral/TMT behavior is byte-identical (they still use the full best-200 estimate).
- Phase 2 finding: peak-rank parity is exact. Using the `msgf-trace --dump-peaks` diagnostic, Rust and Java assign the identical rank to all 465 matched peaks on the canonical I5 label-flip scan (offset +0, no exceptions). This debunks the I5 doc's central hypothesis (H2 = ~40% of divergence): the RANK_DIFF=301 count was an alignment artifact (theoretical-ion m/z cross-matching `y/1` vs `y/2+offset`), not a real rank bug.

Why this matters
Combined with Phase 0 (instrument/tolerance/calibration already correctly resolved on both datasets) and the external review's "BSA 217/217 top-1 parity", the investigation establishes that Rust's scoring is at/near parity with Java. The residual gap (PXD −1.1% after this PR, TMT −5%) is small, sits on low-quality / SpecE-tail spectra, and per the n=9+ project audit regresses Percolator when "fixed" individually. Further ID gain is not reachable on the scoring-parity path — it needs algorithmic changes (tracked separately).
Commits
- docs(spec)
- docs(plan)
- tune(cal)
- diag(trace): `msgf-trace --dump-peaks` + `ScoredSpectrum::dump_active_peaks()` (read-only diagnostic)
- docs(parity)

Bench (Phase 1b, PXD001819 cal=auto)
Calibration now fires (learned +0.094 ppm from 193 confident PSMs). Astral/TMT unchanged by construction (cap stays 200; they had ≥200 confident).
Verification
- `cargo clippy --workspace --all-targets -- -D warnings` clean
- `--dump-peaks`

Out of scope (next: algorithmic roadmap)
Scoring parity is exhausted. Real ID gains need algorithmic work (richer rescoring features, candidate-generation changes, chimeric-spectrum handling, predicted-intensity features), which is being planned separately from parity tweaks.