dlasd2/slasd2: Increase deflation tolerance to match dlasd7/slasd7 by jschueller · Pull Request #1286 · Reference-LAPACK/lapack

jschueller · 2026-06-11T13:59:53Z

Fix DBDSDC/SBDSDC returning non-orthogonal U/V for bidiagonal matrices with many nearly-equal singular values (e.g., all singular values ≈ 1).

The divide-and-conquer bidiagonal SVD has two code paths:

Full vector path (DLASD2/SLASD2): deflation tolerance = 8 * EPS
Compact path (DLASD7/SLASD7): deflation tolerance = 64 * EPS

With the weaker tolerance (8EPS), singular values differing by only ~10EPS (e.g., ~2e-15 for double precision) escape deflation. The subsequent Z computation in DLASD3 then suffers catastrophic cancellation from denominators (σ_i - σ_j) that are tiny, polluting the singular vectors and causing loss of orthogonality.

Raise DLASD2 and SLASD2 to 64*EPS, matching DLASD7 and SLASD7, so more close singular values are deflated and the singular-vector computation remains stable.

Closes #255

Fix DBDSDC/SBDSDC returning non-orthogonal U/V for bidiagonal matrices with many nearly-equal singular values (e.g., all singular values ≈ 1). The divide-and-conquer bidiagonal SVD has two code paths: - Full vector path (DLASD2/SLASD2): deflation tolerance = 8 * EPS - Compact path (DLASD7/SLASD7): deflation tolerance = 64 * EPS With the weaker tolerance (8*EPS), singular values differing by only ~10*EPS (e.g., ~2e-15 for double precision) escape deflation. The subsequent Z computation in DLASD3 then suffers catastrophic cancellation from denominators (σ_i - σ_j) that are tiny, polluting the singular vectors and causing loss of orthogonality. Raise DLASD2 and SLASD2 to 64*EPS, matching DLASD7 and SLASD7, so more close singular values are deflated and the singular-vector computation remains stable. Closes Reference-LAPACK#255

… #1286) Raise the deflation tolerance from 8*eps to 64*eps so near-equal singular values are deflated, preventing loss of orthogonality in the divide-and-conquer bidiagonal SVD. Port of Reference-LAPACK 0e0f2359 (Reference-LAPACK/lapack#1286).

martin-frbg · 2026-06-27T17:55:10Z

One of OpenBLAS' codspeed benchmarks in CI is flagging a large change in the result of an SGESDD call (via comparing the original to a reconstruction of the source matrix from its computed diagonal) with

E       Max absolute difference among violations: 2.336502e-05
E       Max relative difference among violations: 0.01933236

where our thresholds are 1e-5 for absolute and 1.e-7 for relative difference - note the relative difference hitting 1.e-2 - the CI job terminated there before trying DGESDD, but its results would probably be similarly "off".

Is this magnitude of difference to previous LAPACK behaviour expected for the SVD of a "benign" 1000-by-222 matrix of random numbers ? Note the python code for the benchmark lives at https://github.com/OpenMathLib/OpenBLAS/blob/develop/benchmark/pybench/

jschueller · 2026-06-27T19:31:39Z

I ran reference LAPACK SGESDD on a 1000x222 random matrix (uniform [0,1)) using both tolerance values (8*EPS and 64*EPS) to compare:

1. The tolerance change makes no difference for well-conditioned random matrices.
Both old and new tolerances produced identical singular values and reconstruction error on the same matrix. This is expected — the deflation tolerance only matters when singular values are very close (difference < 64*EPS), which rarely happens for random uniform matrices with condition number ~2.7.

2. The atol=1e-5 threshold is too tight for SGESDD on 1000x222 in single precision.
I tested 50 different random seeds. All 50 fail assert_allclose(u @ diag(s) @ vt, a, atol=1e-5) — the max absolute reconstruction error is routinely 2–3e-5 regardless of the tolerance setting. For single precision (EPS ≈ 6e-8), the D&C SVD error bound scales as O(N * EPS) * ||A||, which for N=222 gives an expected error floor around 222 * 6e-8 ≈ 1.3e-5. The benchmark passes for (100, 5) but is borderline for (1000, 222).

3. The 1.9% "relative difference" is from elements near zero.
2.3e-5 / 0.019 ≈ 0.0012 — the worst relative error occurs for matrix elements where |a_ij| ~ 0.001. This is benign floating-point behavior for single-precision SVD.

4. The tolerance change fixes a real catastrophic bug (issue #255).
With 8*EPS, matrices having many nearly-equal singular values (e.g., all ≈ 1) produce U with orthogonality error ≈ 1 (complete failure). The fix raises this to 64*EPS, matching the compact path (DLASD7/SLASD7) that already uses this tolerance and has worked correctly.

The CodSpeed benchmark's atol={'s': 1e-5} is likely too tight for this matrix size in single precision. The 1.9% "change" in the CodSpeed metric is normal numerical churn — the tolerance change does not regress accuracy; it only affects matrices with near-equal singular values where the old code catastrophically failed. The benchmark threshold may need to be relaxed (e.g., atol={'s': 1e-4}) or the (1000, 222) case may need rtol alongside atol.

mgates3 · 2026-06-30T15:12:35Z

Should a similar change be made in the eigenvalue D&C code?
It appears to have a slightly different formula, including Z.

SRC> grep 'TOL *=' dlaed*.f  # eig
dlaed2.f:      TOL = EIGHT*EPS*MAX( ABS( D( JMAX ) ), ABS( Z( IMAX ) ) )
dlaed8.f:      TOL = EIGHT*EPS*ABS( D( JMAX ) )

SRC> grep 'TOL *=' dlasd*.f  # svd, updated by this PR
dlasd2.f:      TOL = EIGHT*EIGHT*EPS*MAX( ABS( D( N ) ), TOL )
dlasd7.f:      TOL = EIGHT*EIGHT*EPS*MAX( ABS( D( N ) ), TOL )

(Output edited for brevity.)

jschueller · 2026-06-30T20:50:59Z

good catch: #1317

langou approved these changes Jun 13, 2026

View reviewed changes

langou merged commit fbd4146 into Reference-LAPACK:master Jun 13, 2026
12 checks passed

jschueller deleted the issue255 branch June 13, 2026 10:04

ilayn mentioned this pull request Jun 21, 2026

MAINT:FIX: Pull in changes from Reference LAPACK 2026Q2 ilayn/semicolon-lapack#18

Merged

rgommers mentioned this pull request Jun 27, 2026

Increase tolerance in ?LASD2 to improve ?BDSDC stability (Reference-LAPACK PR 1286) OpenMathLib/OpenBLAS#5858

Merged

jschueller mentioned this pull request Jun 30, 2026

{d/s}laed2/8: Increase deflation tolerance to match {d/s}lasd2/7 #1317

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

dlasd2/slasd2: Increase deflation tolerance to match dlasd7/slasd7#1286

dlasd2/slasd2: Increase deflation tolerance to match dlasd7/slasd7#1286
langou merged 1 commit into
Reference-LAPACK:masterfrom
jschueller:issue255

jschueller commented Jun 11, 2026

Uh oh!

Uh oh!

martin-frbg commented Jun 27, 2026

Uh oh!

jschueller commented Jun 27, 2026

Uh oh!

mgates3 commented Jun 30, 2026

Uh oh!

jschueller commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

jschueller commented Jun 11, 2026

Uh oh!

Uh oh!

martin-frbg commented Jun 27, 2026

Uh oh!

jschueller commented Jun 27, 2026

Uh oh!

mgates3 commented Jun 30, 2026

Uh oh!

jschueller commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants