slice::contains runtime doubles when an unrelated binary_search benchmark is present #150450

@Gourab-Ghosh

Summary

On my machine, the runtime of slice::contains in a tight loop roughly doubles (~2×) depending on whether an additional, unrelated benchmark using slice::binary_search is present in the same binary. The contains benchmark runs before the binary_search benchmark, yet it becomes much slower whenever that third benchmark is compiled into the program and run afterwards.

This looks like a codegen/layout/inlining interaction (or similar), not a source-level change to the contains call.
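One way to probe the codegen/layout/inlining hypothesis is to move each timed loop into its own `#[inline(never)]` function, so the loops can no longer be merged or laid out together inside `main`. The following is a diagnostic sketch only, not a confirmed explanation or fix. The names `CANDIDATES`, `bench_contains`, and `bench_binary_search` are hypothetical, and a named `const` stands in for the inline `const { ... }` block of the repro, which may itself change the generated code.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

#[allow(dead_code)] // King is never constructed in this sketch
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}

use PieceType::*;

// Assumption: a named const replaces the inline `const { ... }` block.
const CANDIDATES: [PieceType; 4] = [Knight, Bishop, Rook, Queen];

// Each loop lives in its own non-inlined function, so its machine code
// is emitted separately from the other benchmark.
#[inline(never)]
fn bench_contains(iterations: u64) -> (Duration, bool) {
    let start = Instant::now();
    let mut last = false;
    for _ in 0..iterations {
        last = black_box(CANDIDATES.contains(&Pawn));
    }
    (start.elapsed(), last)
}

#[inline(never)]
fn bench_binary_search(iterations: u64) -> (Duration, bool) {
    let start = Instant::now();
    let mut last = false;
    for _ in 0..iterations {
        last = black_box(CANDIDATES.binary_search(&Pawn).is_ok());
    }
    (start.elapsed(), last)
}

fn main() {
    // Kept small so the sketch finishes quickly; raise it for real measurements.
    let iterations: u64 = 1_000_000;

    let (elapsed, found) = bench_contains(iterations);
    println!("contains:      {:?} (found = {})", elapsed, found);

    let (elapsed, found) = bench_binary_search(iterations);
    println!("binary_search: {:?} (found = {})", elapsed, found);
}
```

If the contains timing stops depending on the presence of `bench_binary_search` in this form, that would point towards an interaction inside `main` rather than in the `contains` call itself.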


Reproduction

Command

cargo run --release

Repro code (src/main.rs)

#![allow(unused)]

use PieceType::*;
use std::hint::black_box;
use std::time::Instant;

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}

macro_rules! benchmark {
    ($code:block, $num_iterations:expr $(,)? ) => {{
        let now = Instant::now();
        for _ in 0..$num_iterations {
            black_box($code);
        }
        println!("Time taken to run: {:?}", now.elapsed());
    }};
}

fn main() {
    let num_iterations: usize = 25_000_000_000;

    benchmark!(
        { matches!(Pawn, Knight | Bishop | Rook | Queen) },
        num_iterations,
    );
    benchmark!(
        { const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn) },
        num_iterations,
    );
    benchmark!(
        {
            const { [Knight, Bishop, Rook, Queen] }
                .binary_search(&Pawn)
                .is_ok()
        },
        num_iterations,
    );
}

Observed output (with binary_search benchmark present)

Time taken to run: 4.700495676s
Time taken to run: 10.036953544s
Time taken to run: 4.750905003s

Control case

Comment out the last benchmark!( ... binary_search ... ) block (no other changes), then run the same command.
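Commenting the block out changes both what is compiled and what is executed. A variant that keeps the binary_search loop compiled in but skips it at run time could help tell those two effects apart. The sketch below assumes a hypothetical `--skip-binary-search` command-line switch and hypothetical helpers `skip_binary_search` and `time_loop`; it uses closures and a named `const` instead of the macro and inline `const` block from the repro, so its codegen may differ from the original.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

#[allow(dead_code)] // King is never constructed in this sketch
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}

use PieceType::*;

// Assumption: a named const replaces the inline `const { ... }` block.
const CANDIDATES: [PieceType; 4] = [Knight, Bishop, Rook, Queen];

// Hypothetical switch: true when `--skip-binary-search` is among the arguments.
fn skip_binary_search<I: IntoIterator<Item = String>>(args: I) -> bool {
    args.into_iter().any(|arg| arg == "--skip-binary-search")
}

// Runs `body` `iterations` times and returns the elapsed wall-clock time.
fn time_loop<F: FnMut() -> bool>(iterations: u64, mut body: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(body());
    }
    start.elapsed()
}

fn main() {
    // Kept small so the sketch finishes quickly; raise it for real measurements.
    let iterations: u64 = 1_000_000;
    let skip = skip_binary_search(std::env::args().skip(1));

    let elapsed = time_loop(iterations, || CANDIDATES.contains(&Pawn));
    println!("contains:      {:?}", elapsed);

    // The loop below is always compiled in; the flag only decides whether it runs.
    if !skip {
        let elapsed = time_loop(iterations, || CANDIDATES.binary_search(&Pawn).is_ok());
        println!("binary_search: {:?}", elapsed);
    }
}
```

With this layout, `cargo run --release` and `cargo run --release -- --skip-binary-search` execute the same binary, so any remaining difference in the contains timing cannot come from a change in the compiled code.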

Observed output (without binary_search benchmark)

Time taken to run: 4.819309506s
Time taken to run: 4.75536901s

Expected

The second benchmark (const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn)) should have comparable runtime regardless of whether a later, unrelated benchmark is present.

Actual

contains becomes ~2× slower only when the binary_search benchmark exists in the same program.
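For scale, the reported totals can be converted to an average cost per iteration: about 0.19 ns for the fast runs and about 0.40 ns for the slow contains run. The small helper below does that arithmetic on the timings quoted above; `ns_per_iteration` is a hypothetical name introduced for illustration.

```rust
/// Average cost of one loop iteration, in nanoseconds.
fn ns_per_iteration(total_seconds: f64, iterations: u64) -> f64 {
    total_seconds * 1e9 / iterations as f64
}

fn main() {
    let iterations: u64 = 25_000_000_000;

    // Timings copied from the outputs reported in this issue.
    let with_binary_search = [
        ("matches!", 4.700495676_f64),
        ("contains", 10.036953544),
        ("binary_search", 4.750905003),
    ];
    let without_binary_search = [("matches!", 4.819309506_f64), ("contains", 4.75536901)];

    println!("with binary_search benchmark:");
    for &(name, seconds) in with_binary_search.iter() {
        println!("  {:<14} {:.3} ns/iter", name, ns_per_iteration(seconds, iterations));
    }

    println!("without binary_search benchmark:");
    for &(name, seconds) in without_binary_search.iter() {
        println!("  {:<14} {:.3} ns/iter", name, ns_per_iteration(seconds, iterations));
    }

    // Ratio of the two contains timings.
    println!("contains slowdown: {:.2}x", 10.036953544_f64 / 4.75536901);
}
```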


Environment

Hardware (Lenovo Legion Pro 7i Gen 10, 2025)

  • CPU: Intel Core Ultra 9 275HX
  • RAM: 64 GB DDR5-6400
  • GPU: NVIDIA GeForce RTX 5070 Laptop GPU

Software

  • OS: Garuda Linux (Arch-based), x86_64
  • Rust Version: rustc 1.92.0 (ded5c06 2025-12-08) (Arch Linux rust 1:1.92.0-1)
  • Build/run: cargo run --release

Labels

  • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
  • C-bug (Category: This is a bug.)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
