Skip to content

The BLAS/LAPACK surgery#229

Merged
Bruno02468 merged 12 commits into
MystranSolver:devfrom
Bruno02468:system_lapack_fix
May 28, 2026
Merged

The BLAS/LAPACK surgery#229
Bruno02468 merged 12 commits into
MystranSolver:devfrom
Bruno02468:system_lapack_fix

Conversation

@Bruno02468
Copy link
Copy Markdown
Collaborator

@Bruno02468 Bruno02468 commented May 28, 2026

The BLAS/LAPACK surgery

Here's the situation: until now, MYSTRAN had, within its source tree, somewhat-modified versions of the reference (Netlib) BLAS and LAPACK implementations. This caused some issues:

  1. When attempting to link dynamically against system BLAS/LAPACK (as the SuperLU build script can request), symbol interposition replaces some MYSTRAN-embedded BLAS/LAPACK functions with modified signatures, resulting in stack violations and (hopefully) crashes.
  2. When attempting to link statically against system BLAS/LAPACK, the performance gains are minimal since only SuperLU can see the symbols at the linking step. If you change the linking order to force MYSTRAN itself to see the system BLAS, you'll get symbol naming conflicts.
  3. Merely linking against the BLAS subroutines is not enough, because LAPACK also matters for performance.
  4. OpenBLAS specifically has better performance if we set its threads to 1.

So... here are the steps I've taken. I've called this whole effect the "BLAS/LAPACK surgery" due to the need to extricate something very intrisic.

Phase 1: map out the incompatibilities.

Merely deleting the BLAS and LAPACK folders from the source tree and linking against, say OpenBLAS, wasn't enough.

We had modded functions with different signatures for convenience/logging, and some inconsequential changes to the math.

After mapping out these mods, I was ready to begin the "surgery".

Phase 2: update imports and move mods out.

Whereas BLAS functions had no actual math modifications, some LAPACK ones did. These mods were moved to the MYSTRAN_LAPACK_EXT module.

Also, LAPACK imports were all changed to take into account the functions all coming from "outside" of the current source tree.

I also had to restore some EXTERNAL declarations in the embedded ARPACK module that were commented out. That was breaking buckling problems. Major props to @victorkemp for helping me figure out what was going on!

Phase 3: move submodules and modify the build script.

We got too many submodules. This was true before I even added the Netlib lapack submodule (also includes BLAS). I moved all submodules (and patches directories) to the submodules dir.

Once that was done, the build script now has the MYSTRAN_BLAS_LAPACK input variable, which can be EMBEDDED (builds the Netlib reference implementations from the submodules), SYSTEM (locates a system-provided BLAS/LAPACK implementation, like OpenBLAS, Intel MKL, Apple Accelerate, APL, etc.), or AUTO. The other enable_internal_blaslib variable is now deprecated.

Since we'll always link against some BLAS, that means SuperLU will never ever need to link against its CBLAS. Therefore, f2c can be fully removed from the build script, it is no longer a dependency.

Also, I added the licenses for the most common system BLAS implementations I could find, so they can be statically linked against without violating the licenses' "must include a copy of this copyright notice" clause. The missing ones, like APL and Apple Accelerate, are the ones I couldn't find honest-to-god license TXTs for. They say TXTs are included in the local installation, so I call upon Arm and Apple uses to add the licenses. If in doubt, ping me on Discord, and I'll handle it myself. All I'll need is the license text file.

Bonus: Handle OpenBLAS threads

I also added a SET_BLAS_THREADS subroutine that can call the OpenBLAS function to force the threads to be 1. If we find out that other BLAS providers also perform better with different numbers of threads, we can add more macros to that file so we have better control of the number of threads. For now, the only one I could add support for was OpenBLAS.


Okay, so all that's done. The end-result is: MYSTRAN now only uses fully-standard BLAS and LAPACK calls, allowing us to link against any BLAS/LAPACK-providing library, such as Apple Accelerate, Intel MKL, Arm Performance Libraries, etc. This also prevents future issues in case we ever decide to link against other libraries.

And, of course, when we link against system-provided BLAS, we get better performance. That's very much expected.

@Bruno02468 Bruno02468 merged commit 9fddbdc into MystranSolver:dev May 28, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant