Conversation
Investigation design (like the I5 trace work): close the Rust-vs-Java label-flip gap on the two datasets where Rust trails upstream Java. Four levers, sequenced highest-leverage / lowest-risk first, each bench-gated on all three datasets with a revert path: - Phase 0: diagnostic (what instrument/tolerance/calibration each run actually resolves to) - Phase 1: config levers (instrument/tolerance like the iter20 win; calibration engagement) - Phase 2: label-flip hot-path fix (H2 peak-rank, then H3 log-prob) - Phase 3: additive PIN features (safety net) Stretch target +10% over current Rust framed as a direction, not a revert-all gate; ship every net-positive change under the hard gate.
…T ID rate Task-by-task plan for the design spec: Phase 0 diagnostic (prerequisite), Phase 1 config levers (instrument/tolerance + calibration, conditional on Phase 0), Phase 2 label-flip hot-path fix (H2 peak-rank then H3 log-prob, each with unit test + I5 trace re-run + Percolator bench gate + golden regen), Phase 3 additive PIN features (safety net), Task 4 close-out PR. Phase 2 code is intentionally an investigation protocol (pinned file:line, candidate rules, unit-test shape, bench gate) rather than a pre-written diff, since the exact tie-break/indexing edit is only knowable after the trace + Java-source read.
…low-res PXD001819 Phase 0 found precursor calibration was SKIPPED on PXD001819: of ~498 sampled spectra, only 193 yielded sub-1e-6 SpecEValue PSMs — just under the 200 firing threshold. Root cause: low-res ion-trap CID MS2 yields fewer high-confidence IDs from the fixed 500-spectrum sample. The cap on residuals used for the median shift and the minimum required to fire were the SAME constant (200), so the only ways to fire were to sample more (doubles prepass wall on ALL datasets — Astral 22s->45s, fails the speed gate) or lower the shared constant (would shrink the Astral/TMT estimate from 200 to 150 residuals too). Decouple them: keep RESIDUAL_CAP = 200 (Astral/TMT estimate unchanged, still uses the full best-200) and lower the firing minimum MIN_CONFIDENT_PSMS to 150 (PXD's 193 now clears it). Only PXD's behavior changes: skip -> fire. Zero extra prepass cost on any dataset. Local gates green: build, calibrator integration tests (5/5), clippy, and the cal=off bit-identical PIN gate (this only affects the cal=auto/on firing decision).
…tion Phase 2 investigation tooling. Rust assigns peak ranks ~2 higher than Java for the same spectrum (the label-flip root cause), and the existing per-ion trace can't show WHICH peaks drive the offset. Add a read-only `--dump-peaks` mode that prints the post-filter, post-deconvolution active peak list (rank, m/z, intensity) sorted by rank, plus a focused `ScoredSpectrum::dump_active_peaks()` accessor. No scoring change.
… to Java Using msgf-trace --dump-peaks to compare actual observed peaks (not theoretical-ion m/z), Rust and Java assign the identical rank to all 465 matched peaks on scan 41522 (offset +0, no exceptions). This debunks the I5 doc's central hypothesis that ~40% of the scoring divergence is peak-rank assignment (H2). That RANK_DIFF=301 count was an artifact of aligning by theoretical-ion m/z, which cross-matches different physical ions (Rust y/1 vs Java y/2+offset at coincidentally-equal m/z). Consequence: the peak-rank fix this whole investigation was built around does not exist (Rust already matches Java). Combined with Phase 0 (instrument /tolerance/calibration already correct) and the reviewer's BSA 217/217 top-1 parity, Rust's scoring is at/near parity with Java. The residual gap (PXD -1.1% after Phase 1b, TMT -5%) is small, on low-quality/SpecE-tail spectra, and not closable via scoring-parity fixes without Percolator regression. +10% over current Rust is not reachable on the scoring path.
ID-rate: calibration nudge (+53 PXD) + Phase-2 finding (peak-rank parity is exact)
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can customize the high-level summary generated by CodeRabbit.Configure the |
📝 WalkthroughWalkthroughThis PR adds peak rank diagnostic infrastructure to enable external rank verification, tunes precursor calibration constants for residual selection, documents Phase 2 parity findings showing identical rank assignment between Rust and Java, and defines a multi-phase performance investigation protocol with bench gates. ChangesPeak rank diagnostics and calibration tuning
Sequence Diagram(s)No sequence diagram generated. Changes are diagnostic feature additions and constant tuning without multi-component flows. Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/superpowers/specs/2026-05-28-id-rate-pxd001819-tmt-design.md (1)
35-39: ⚡ Quick winUpdate H2 assumptions in the design doc to match current parity evidence.
The design still treats H2 as a major active root cause, but the new Phase 2 parity note added in this PR concludes H2 is null under observed-peak matching. Aligning this section will keep follow-on work scoped to remaining validated hypotheses.
Also applies to: 136-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/superpowers/specs/2026-05-28-id-rate-pxd001819-tmt-design.md` around lines 35 - 39, Update the H2 bullet and the corresponding paragraph under the "H2 — peak-rank assignment" heading to reflect the Phase 2 parity finding that H2 is null under observed-peak matching: change wording that treats H2 as an active root cause to state that observed-peak matching shows parity (H2 not responsible) and scope follow-on work to H1/H3 only; make the identical wording change in the second occurrence of this discussion (the section referenced as lines ~136-147) so both places consistently state H2 is null under observed-peak matching and reference the Phase 2 parity note.docs/superpowers/plans/2026-05-28-id-rate-pxd001819-tmt-plan.md (1)
181-262: ⚡ Quick winReconcile Phase 2 execution steps with the new H2-null finding.
This plan still operationalizes H2 tie-break changes as an active task, but the new parity note in this PR states rank assignment is already bit-identical. Please gate or retire Task 2a (and downstream 2b prerequisites) to avoid driving unnecessary hot-path edits and golden churn.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/superpowers/plans/2026-05-28-id-rate-pxd001819-tmt-plan.md` around lines 181 - 262, The plan's Task 2a is now obsolete given the new H2-null parity note; update the checklist to gate or retire Task 2a (and any dependent Task 2b steps) so no code edits or golden regenerations are performed unless parity is proven non-identical. Edit the plan text to (1) mark "Task 2a: H2 peak-rank assignment" as gated/retired, (2) add a brief pointer to the PR parity note and the condition that must be met to re-open the task, and (3) remove or disable the operational steps that instruct changing nearest_peak_rank_in and ScoredSpectrum::new and regenerating goldens unless a failing parity trace proves a rank divergence; ensure references to nearest_peak_rank_in and ScoredSpectrum::new remain as rationale but not actionable steps.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@docs/superpowers/plans/2026-05-28-id-rate-pxd001819-tmt-plan.md`:
- Around line 181-262: The plan's Task 2a is now obsolete given the new H2-null
parity note; update the checklist to gate or retire Task 2a (and any dependent
Task 2b steps) so no code edits or golden regenerations are performed unless
parity is proven non-identical. Edit the plan text to (1) mark "Task 2a: H2
peak-rank assignment" as gated/retired, (2) add a brief pointer to the PR parity
note and the condition that must be met to re-open the task, and (3) remove or
disable the operational steps that instruct changing nearest_peak_rank_in and
ScoredSpectrum::new and regenerating goldens unless a failing parity trace
proves a rank divergence; ensure references to nearest_peak_rank_in and
ScoredSpectrum::new remain as rationale but not actionable steps.
In `@docs/superpowers/specs/2026-05-28-id-rate-pxd001819-tmt-design.md`:
- Around line 35-39: Update the H2 bullet and the corresponding paragraph under
the "H2 — peak-rank assignment" heading to reflect the Phase 2 parity finding
that H2 is null under observed-peak matching: change wording that treats H2 as
an active root cause to state that observed-peak matching shows parity (H2 not
responsible) and scope follow-on work to H1/H3 only; make the identical wording
change in the second occurrence of this discussion (the section referenced as
lines ~136-147) so both places consistently state H2 is null under observed-peak
matching and reference the Phase 2 parity note.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 84b07f2d-6d70-4c28-84d8-d281b3300453
📒 Files selected for processing (8)
.gitignorecrates/msgf-rust/src/bin/msgf-trace.rscrates/scoring/src/scoring/scored_spectrum.rscrates/search/src/mass_calibrator.rscrates/search/src/precursor_cal.rsdocs/parity-analysis/notes/2026-05-28-phase2-peak-rank-parity.mddocs/superpowers/plans/2026-05-28-id-rate-pxd001819-tmt-plan.mddocs/superpowers/specs/2026-05-28-id-rate-pxd001819-tmt-design.md
Summary by CodeRabbit
Release Notes
New Features
--dump-peaksdiagnostic flag to trace utility for peak information output.Documentation
Chores