Skip to content

Use unchecked indexing in integer _fmt_inner to keep bounds checks elided#155289

Open
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-fmt-num-unchecked-152061
Open

Use unchecked indexing in integer _fmt_inner to keep bounds checks elided#155289
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-fmt-num-unchecked-152061

Conversation

@blackms
Copy link
Copy Markdown

@blackms blackms commented Apr 14, 2026

Fixes #152061.

The _fmt_inner helper in core::fmt::num writes into its MaybeUninit<u8> buffer through regular [] indexing and relies on surrounding core::hint::assert_unchecked hints to convince LLVM to elide the resulting bounds checks. Under -Copt-level=z with fat LTO the hints were dropped by LLVM 21 and the optimizer left a panic_bounds_check call on the hot integer-formatting path, regressing 1.92 → 1.93. LLVM 22 (shipping with 1.95) fixes the propagation upstream, so this change is primarily defensive: it removes the reliance on LLVM's propagation of assert_unchecked-derived ranges across the _fmt_inner inliner/LTO boundary, and makes fmt::Display for integers usable today for 1.93 / 1.94 users on fat-LTO no-panic builds without waiting for LLVM 22.

This PR switches the per-pair buffer writes to get_unchecked_mut / get_unchecked on both buf and DECIMAL_PAIRS, in three places:

  1. The 4-byte main loop (two pairs per iteration).
  2. The 2-byte branch for the next pair.
  3. The 1-byte branch for the final digit.

Each unsafe block has a short rationale comment and a tight SAFETY: comment stating just the invariants (offset + N <= buf.len(), and the numeric bound on the DECIMAL_PAIRS index). The assert_uncheckeds themselves are retained — they still document the offset / buf.len() invariant of the function and help downstream codegen for the parts of the routine outside these unsafe blocks.

A codegen-llvm regression test is added under tests/codegen-llvm/issues/ with opt-3 and opt-z revisions, checking that the Display path for usize and u64 does not contain any panic_bounds_check, paired with a sanity check that confirms panic_bounds_check is still the symbol emitted for a real non-elidable index — so the CHECK-NOTs cannot pass vacuously. Note: a single-crate codegen test can only check the caller's IR; the LTO-conditional aspect of the original bug is out of reach for this harness. A before/after IR demo against a real no_std-style bin with fat LTO is included in the review reply.

r? libs


LLM disclosure: this PR was roughly a 50/50 collaboration between me and a Claude-based coding agent team. Agent-side: read library/core/src/fmt/num.rs, identified the three _fmt_inner bounds-check call sites from the issue's MCVE, drafted the get_unchecked conversion with the surrounding SAFETY comments, and drafted the codegen-llvm regression test. Human-side: reproduced the original bug locally (stable + fat LTO + opt-z), reviewed the diff, validated that the patched stage1 eliminates the two panic_bounds_check(-4, 20) / (-2, 10) calls from the LTO IR, and opened the PR. Happy to answer any question about the reasoning or the reproduction steps.

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 14, 2026

Some changes occurred in integer formatting

cc @tgross35

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 14, 2026
Comment on lines +1 to +10
//@ compile-flags: -Copt-level=3
// Regression test for https://github.com/rust-lang/rust/issues/152061.
//
// `impl fmt::Display` for integers lowered through `_fmt_inner` in
// `library/core/src/fmt/num.rs` used to leave a `panic_bounds_check`
// path in optimized LLVM IR when LLVM failed to propagate the
// `assume`-based range information across LTO / `opt-level=z`
// boundaries. The implementation was rewritten to use
// `get_unchecked{_mut}` for the buffer writes, so the `panic_bounds_check`
// path must not appear regardless of the optimizer's propagation.
Copy link
Copy Markdown
Contributor

@tgross35 tgross35 Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is testing opt-level=3 but the problematic case seems to be opt-level=z. You probably want to use revisions to handle both. (//@ revisions: opt-3 opt-z then //@ compile-flags[opt-3]: ... and the same for opt-z)

View changes since the review

Comment on lines +213 to +218
// SAFETY: `offset + 4 <= buf.len()` by the asserts above, and
// `pair1, pair2 < 100` so every `DECIMAL_PAIRS` index is `< 200`
// which is the exact length of `DECIMAL_PAIRS`. Using unchecked
// indexing here avoids relying on LLVM to elide the bounds
// checks, which can regress under `opt-level=z` + fat LTO
// (see https://github.com/rust-lang/rust/issues/152061).
Copy link
Copy Markdown
Contributor

@tgross35 tgross35 Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These safety comments are wordy, please try to distill them. The reasoning can be split out of the safety comment since it's not part of the invariant.

View changes since the review

@tgross35
Copy link
Copy Markdown
Contributor

Could you post a quick before+after codegen demo?

I'm on the fence about whether we should bother with this at all, is there an LLVM bug? The entire standard library is going to be a lot larger and slower if trivial bounds checks can't be elided.

…elided

The `_fmt_inner` helper in `core::fmt::num` wrote into its
`MaybeUninit<u8>` buffer through regular `[]` indexing and then relied
on surrounding `core::hint::assert_unchecked` hints to convince LLVM to
elide the resulting bounds checks. Under `-Copt-level=z` with fat LTO
(and other configurations where the range information fails to
propagate) the hints were dropped and the optimizer left a
`panic_bounds_check` call on the hot integer-formatting path.

Switch the per-pair buffer writes to `get_unchecked_mut` / `get_unchecked`
on both `buf` and `DECIMAL_PAIRS`. Each `unsafe` block carries a short
rationale comment (why unchecked indexing) and a tight `SAFETY` comment
stating the invariants (`offset + N <= buf.len()`, and the numeric bound
on the `DECIMAL_PAIRS` index). The existing `assert_unchecked`s are
retained because they still document the `offset` / `buf.len()`
invariants and help downstream codegen for the rest of the routine.

Also add a codegen-llvm regression test with `opt-3` and `opt-z`
revisions that checks the `Display` path for `usize` and `u64` does
not contain any `panic_bounds_check`, with a paired sanity check to
make sure the symbol itself is still emitted by the compiler for a
real out-of-bounds index so the `CHECK-NOT`s cannot pass vacuously.
@blackms blackms force-pushed the fix-fmt-num-unchecked-152061 branch from 18e150a to d3b2e82 Compare April 14, 2026 21:31
@blackms
Copy link
Copy Markdown
Author

blackms commented Apr 14, 2026

Thanks for the review @tgross35! Addressed all three points in the amended commit and collected before/after data on the "should we bother" question.

1. Test revisions (opt-3 and opt-z)

Switched to //@ revisions: opt-3 opt-z with per-revision compile-flags, matching the idiom in several existing codegen tests. Both revisions pass locally via ./x test tests/codegen-llvm/issues/fmt-display-no-bounds-check-152061.rs (2/2).

Caveat worth flagging: the original bug is LTO-conditional, and a codegen-llvm test compiles a single lib crate without performing cross-crate LTO. The current test checks that format_usize / format_u64 don't emit panic_bounds_check in their own IR, which they never did — the regression only surfaces post-LTO when _fmt_inner is inlined into its callers. So the test is primarily a forward-looking guard on the source pattern; the real reproduction is the IR demo below.

2. SAFETY comments distilled

Each unsafe block now has:

Rationale and invariant are no longer mixed.

3. "Should we bother with this at all?" — and the LLVM bug

Short answer: it's a real LLVM 21 bug that's already fixed upstream in LLVM 22 (via #150722, which lands in 1.95). On current nightly the MCVE compiles cleanly without this PR. So the honest framing of this change is defensive, not "fixes a current reproducer on master":

  • The _fmt_inner code relies on LLVM propagating assert_unchecked-derived range info across an inliner/LTO boundary. LLVM 21 broke that assumption for the whole 1.93 / 1.94 cycle on fat-LTO opt-z builds (regressed by Add LLVM range attributes to slice length parameters #148350 adding range attrs to slice lengths).
  • get_unchecked at the source level makes the guarantee independent of LLVM's ability to propagate the range.
  • The 1.93 / 1.94 users this matters to are specifically fat-LTO no-panic-allowed builds — the issue reporter's embedded use case, where panic_bounds_check leaking in turns into a link-time error.

Before / after codegen demo (scratch bin, panic=abort, lto=fat, opt-level=z, cgu=1):

BEFORE AFTER
panic_bounds_check call sites in LTO IR 29 26
Calls with the MCVE -4, 20 / -2, 10 signature (from _fmt_inner) 2 0
Standalone _fmt_inner (u32) function with personality ptr @rust_eh_personality 1 0

BEFORE uses Homebrew's stable rustc 1.94.1 (LLVM 21); AFTER uses stage1 built from this branch. Same source, same cargo flags. The two panic_bounds_check(i64 -2, i64 10, …) and (i64 -4, i64 20, …) calls that appear in the BEFORE IR are exactly the add nsw i64 %24, -4 pattern from the issue's MCVE. They disappear AFTER. Unblocking inlining of _fmt_inner also removes the separate u32 _fmt_inner function with its eh personality.

Binary __TEXT is page-aligned so I don't have a "N bytes smaller" number to show — the change is too small to cross a page. The IR-level elimination is the real measurable effect.

Given that LLVM 22 already fixes this upstream, I'm fine with either direction: land this as a defensive source-level guarantee on a hot formatting path, or close it and let 1.95 + LLVM 22 resolve the user impact. If you'd prefer to wait for LLVM 22, no objection from me — the motivation is genuinely weaker than it looked when I first reproduced the bug on stable. Your call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

panic_bounds_check isn't optimized away in impl fmt::Display for integers (regression between 1.92 and 1.93)

4 participants