Update Benchstones (BenchI & BenchF) to validate results by abdulrahmanhossam · Pull Request #125383 · dotnet/runtime

abdulrahmanhossam · 2026-03-10T13:37:46Z

This PR completes the correctness validation for the BenchI (Integer) and BenchF (Floating Point) suites. Previously, many of these benchmarks only measured performance while silently returning true. This update adds O(1) post-loop checks that verify the JIT compiler produces mathematically correct results, without affecting the benchmarks' performance profile.

Key Changes:

1. BenchI (Integer) Suite

AddArray / AddArray2: Validated final accumulated indices and values, including intentional 32-bit overflows (e.g., verifying that a[200][200] resolves correctly after overflow).
MulMatrix: Validated the 900jk mathematical derivation for the result matrix.
NDhrystone: Verified global state consistency (Iterations + 10).
IniArray: Added post-loop check to verify successful array initialization.
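The idea behind these integer checks can be sketched as follows. This is an illustration only, written in Python rather than the C# of the actual benchmarks; `wrap32` and `bench_sum` are hypothetical names and do not correspond to code in the PR. The loop accumulates with 32-bit wraparound, and a single closed-form comparison after the loop confirms the result, overflow included.

```python
def wrap32(x: int) -> int:
    """Reduce a Python int to the value a signed 32-bit integer would hold."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x >= 0x8000_0000 else x


def bench_sum(n: int, step: int) -> bool:
    """Hypothetical benchmark body: accumulate, then validate once."""
    acc = 0
    for _ in range(n):
        # Hot loop: no validation happens in here.
        acc = wrap32(acc + step)

    # O(1) post-loop check: the closed form n * step, wrapped to 32 bits.
    expected = wrap32(n * step)
    return acc == expected
```

With `n = 1000` and `step = 3_000_000` the true sum exceeds the signed 32-bit range, so the check only passes if the wrapped value is the one expected.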

2. BenchF (Floating Point) Suite

Root Finding (Bisect, NewtR, Regula, Secant, NewtE): Added validation for the final roots of the equations. For example, verifying the Plastic Ratio ($\approx 1.32471$) for the equation $x^3 - x - 1 = 0$.
Integration (Romber, Simpsn, Trap): Verified the definite integral results for $e^{-x^2}$ and $e^{-2x}$ against their analytical values.
Linear Algebra (InvMt, MatInv4, SqMtx): Added checks for matrix inversion accuracy by verifying the resulting Identity Matrix and Bit Error Rate (BER).
Chaos & Signal Processing (Lorenz, FFT, DMath): Validated final attractor coordinates for Lorenz and frequency domain components (DC offset) for FFT.
Whetstone: Added a precision check for the circular Exp(Log(x)) transformations in Module 11 to ensure identity stability.
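As a rough sketch of what such floating-point checks look like, the Python below validates a bisection root against the Plastic Ratio and a trapezoid-rule integral of $e^{-2x}$ against its analytical value. It is not the PR's C# code. The function names, the bracket $[1, 2]$, the integration interval $[0, 1]$, the step count, and the tolerances are all assumptions chosen for illustration.

```python
import math

# Real root of x^3 - x - 1 = 0 (the Plastic Ratio).
PLASTIC_RATIO = 1.324717957244746


def bisect(f, lo, hi, iterations=60):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def trapezoid(f, a, b, n=1000):
    """Composite trapezoid rule with n equal sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def validate_root() -> bool:
    root = bisect(lambda x: x**3 - x - 1.0, 1.0, 2.0)
    return abs(root - PLASTIC_RATIO) < 1e-10


def validate_integral() -> bool:
    # Assumed interval [0, 1]; the exact value is (1 - e^-2) / 2.
    approx = trapezoid(lambda x: math.exp(-2.0 * x), 0.0, 1.0)
    exact = 0.5 * (1.0 - math.exp(-2.0))
    return abs(approx - exact) < 1e-5
```

The looser tolerance on the integral reflects the discretization error of the trapezoid rule, not floating-point rounding.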

Technical Implementation Details:

Zero Overhead: All validations are performed strictly outside the timed benchmark loops to ensure no impact on JIT code quality measurements.
Floating Point Tolerance: Applied appropriate epsilon values (ranging from 1e-5 to 1e-10) to account for precision differences across different hardware architectures (x64, ARM64).
Error Handling: Updated goto error labels (e.g., in MatInv4) to return false upon numerical failure instead of a silent true, ensuring test reliability.
Audited & Verified: Audited the entire directory; files already containing internal verification (like Adams, InProd, and LLoops) were kept as-is to maintain original logic.
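The pattern these points describe (time only the kernel, validate afterwards, and report failure instead of a silent true) might look roughly like the Python sketch below. `run_benchmark` and its parameters are hypothetical and are not taken from the benchmark harness in dotnet/runtime.

```python
import time


def run_benchmark(kernel, validate, iterations=100):
    """Time `kernel` over several iterations, then validate the last result.

    Returns (elapsed_seconds, passed). Validation runs after the timer
    stops, so it does not contribute to the measured time.
    """
    result = None
    start = time.perf_counter()
    for _ in range(iterations):
        result = kernel()
    elapsed = time.perf_counter() - start

    # Post-loop check: a failed validation yields False, never a silent True.
    return elapsed, bool(validate(result))


if __name__ == "__main__":
    # Sum of 0.5 * i for i in 0..999 is exactly 249750.0.
    kernel = lambda: sum(0.5 * i for i in range(1000))
    elapsed, passed = run_benchmark(kernel, lambda r: abs(r - 249750.0) < 1e-9)
    print(passed)
```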

Checklist:

All O(1) checks are outside the performance-critical loops.
Verified that the benchmarks still pass under normal conditions.
No new dependencies added.

EgorBo · 2026-03-10T15:04:27Z

To be honest, I don't think these tests add much value (besides checking JIT asserts). It's fine to add validation, but this PR makes a lot of absolutely unnecessary indentation changes and adds magic numbers like

We prefer that first-time contributors focus on simpler, cleaner changes.

abdulrahmanhossam · 2026-03-10T15:23:52Z

Hi @EgorBo,

Thanks for the feedback. Regarding the indentation changes, these were automatically applied by the IDE's formatter on save to adhere to standard C# styles. I understand this makes the diff noisier than intended, so I will revert the unnecessary styling changes and keep the focus strictly on the validation logic.

As for the "magic numbers," these are the mathematically expected reference values for these specific algorithms (e.g., the Lorenz attractor's steady state or the Plastic Ratio for root-finding). While these tests do catch JIT asserts, adding correctness validation ensures that the JIT isn't just "running" the code, but generating correct numerical output—which is the core purpose of #5049.

I will clean up the PR by:

Reverting all unnecessary whitespace/indentation changes.
Replacing "magic numbers" with descriptive constants or adding comments to explain the origin of these reference values.

I believe even for a first-time contributor, ensuring the correctness of these long-standing benchmarks is a valuable goal. Updated commits coming soon.

abdulrahmanhossam · 2026-03-10T21:39:05Z

Closing this for now, might revisit later

Add O(1) correctness validation to missing BenchI benchmarks

33dde44

dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Mar 10, 2026

github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Mar 10, 2026

Add O(1) correctness validation to BenchF suite

dbcee23

abdulrahmanhossam added 2 commits March 10, 2026 23:34

Address code review: Revert formatting changes and replace magic numbers with descriptive constants/boundaries

7f3fee6

Merge branch 'main' into feature/issue-5049-bench-validation

6ea4236

abdulrahmanhossam closed this Mar 10, 2026

abdulrahmanhossam deleted the feature/issue-5049-bench-validation branch March 10, 2026 21:39

abdulrahmanhossam wants to merge 4 commits into dotnet:main from abdulrahmanhossam:feature/issue-5049-bench-validation
