Skip to content

Cap iteration count in ScalarReplacementOfAggregates#155290

Open
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-sroa-hang-153205
Open

Cap iteration count in ScalarReplacementOfAggregates#155290
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-sroa-hang-153205

Conversation

@blackms
Copy link
Copy Markdown

@blackms blackms commented Apr 14, 2026

Fixes #153205.

Problem

ScalarReplacementOfAggregates (compiler/rustc_mir_transform/src/sroa.rs) runs a fixed-point loop { … } that keeps flattening aggregate locals until no further "dead" locals are produced. Inside, compute_flattening calls iter_fields, which normalizes each field type via try_normalize_erasing_regions before feeding it back as a new local.

On the MCVE from the issue:

trait Apply { type Output<T>; }
struct Identity;
impl Apply for Identity { type Output<T> = T; }
struct Thing<A: Apply>(A::Output<Self>);

fn foo<A: Apply>() { let _x: Thing<A>; }
fn main() { foo::<Identity>(); }

normalizing Thing<Identity>'s field type yields Thing<Identity> again — the enclosing struct itself. An E0391 layout-cycle error is emitted up front, but MIR optimization still runs, and every SROA iteration allocates exactly one fresh local of type Thing<Identity> which is itself a fresh flattening candidate. all_dead_locals is therefore non-empty forever, and the loop never terminates.

Reproduced locally with rustc 1.94.1 stable: after the E0391 is printed the compiler hangs indefinitely (killed after 12 s during repro).

Fix

Replace the unbounded loop with for _ in 0..MAX_ITERATIONS where const MAX_ITERATIONS: usize = 16. Per @scottmcm on the issue thread, 2 iterations typically capture all of SROA's value and non-contrived code never exceeds 10, so 16 leaves comfortable headroom while guaranteeing termination on tainted input. The loop body (including the break on empty all_dead_locals) is unchanged, so healthy code reaches its fixed point exactly as before.

I considered adding an early-bail on body.tainted_by_errors / tcx.dcx().has_errors() instead, but opted against it: it's a larger behavioural change (skipping SROA entirely on tainted bodies) and the same non-termination shape could theoretically be reached without the tainting flag being set at pass entry. A structural iteration cap is a simpler, more local invariant and is robust to both.

Test

Added tests/ui/layout/hang-sroa-layout-cycle.rs next to the existing layout-cycle.rs / post-mono-layout-cycle.rs tests. Uses //@ build-fail with //@ compile-flags: -Copt-level=3 (SROA is gated on mir_opt_level >= 2) and asserts the E0391 cycle error via //~^ ERROR. Without the fix compiletest times out on this test; with the fix it completes and matches the expected stderr.

r? mir-opt

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 14, 2026

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 14, 2026
@Kivooeo
Copy link
Copy Markdown
Member

Kivooeo commented Apr 14, 2026

  1. how llm was used for this pr?

  2. am i read this correct?

  In practice, 2
   // iterations capture nearly all the value and more than 10 is never
   // needed for non-contrived code.

it says that 2 is nearly covers all cases and more than 10 is never needed, so you go 16? why?

@rust-log-analyzer

This comment has been minimized.

The SROA pass repeatedly flattens aggregate locals into their field
locals, looping until a fixed point is reached. When the MIR body
involves a type whose layout is cyclic (e.g. `struct T(T)` reached via
an associated type projection), the layout cycle error is emitted but
compilation continues into MIR optimization. Inside SROA, `iter_fields`
then keeps returning a single field whose normalized type is the
enclosing struct itself, so each iteration allocates a fresh replacement
local and the fixed-point loop never terminates.

Bound the loop to a small constant. In practice two iterations capture
essentially all of SROA's value and non-contrived code does not exceed
ten, so 10 is sufficient and guarantees termination on tainted input.
@blackms blackms force-pushed the fix-sroa-hang-153205 branch from aa5cf73 to e6d8c3c Compare April 14, 2026 14:48
@blackms
Copy link
Copy Markdown
Author

blackms commented Apr 14, 2026

Thanks for the review @Kivooeo!

how llm was used for this pr?

Full disclosure: I used a Claude-based coding agent team for the initial investigation — reading the SROA pass, forming the root-cause hypothesis, drafting the fix from @scottmcm's comment, and sketching the test. I then validated manually before opening the PR: reproduced the hang locally with rustc 1.94.1 stable on the MCVE (killed after 12 s after the E0391), read the full diff myself, and checked the test placement against sibling files (layout-cycle.rs, post-mono-layout-cycle.rs). Happy to answer any question about the reasoning or rewrite anything if something reads as machine-y.

it says that 2 is nearly covers all cases and more than 10 is never needed, so you go 16? why?

Fair point, I was being overcautious. Dropped the cap to 10 in the force-push just now — that's the number @scottmcm explicitly called out as "never needed" in non-contrived code, so using exactly that value makes the relationship between comment and constant obvious. Comment updated to match.

Also force-pushed a fix for the CI failure: when I renamed the test from hang-sroa-layout-cycle-153205.rs to hang-sroa-layout-cycle.rs I forgot to update the $DIR/… path and line number in the .stderr fixture (the .rs file gained a blank line from the URL rewrite). That's what aarch64-gnu-llvm-21-1 was complaining about; should be green on the new push.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compiler hangs with uninitialized variable containing a struct that contains itself

5 participants