@mayeut
Contributor

mayeut commented Nov 22, 2025

  • I updated the package version in pyproject.toml and made sure the first 3 numbers match git describe --tags --abbrev=8 in OpenBLAS at the OPENBLAS_COMMIT. If I did not update OPENBLAS_COMMIT, I incremented the wheel build number (i.e. 0.3.29.0.0 to 0.3.29.0.1)
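The version check in this item can be sketched in a few lines of Python. This is only an illustration, not a script from the repository: the helper names `first_three` and `versions_agree` are hypothetical, and the sample `git describe` strings in the usage note are assumed shapes of the `<tag>-<n>-g<hash>` output, not values taken from this PR.

```python
import re


def first_three(text):
    """Return the first x.y.z triple found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def versions_agree(package_version, describe_output):
    """True when the first three numbers of both strings are identical."""
    return first_three(package_version) == first_three(describe_output)
```

For example, `versions_agree("0.3.29.0.1", "v0.3.29")` is true, while a package version starting with `0.3.30` checked against a describe output starting with `v0.3.29` is not.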

Builds on top of #230

Using clang instead of gcc makes it possible:

  • to get a very recent compiler that supports recent SIMD extensions without having to wait for a new gcc-toolset or update the manylinux base image.
  • to get faster builds when using QEMU

The clang install script might end up being included in manylinux images (see pypa/manylinux#1871); for now it has been copied directly from https://github.com/scikit-build/ninja-python-distributions/blob/master/scripts/install-static-clang.sh.

@mayeut
Contributor Author

mayeut commented Nov 23, 2025

I thought the fork test hang would disappear after an OpenBLAS update (it looked like the issue mentioned in #229), but there are still random deadlocks in the fork test under QEMU. Whether this is a QEMU bug, or running under QEMU simply makes an existing race condition more likely to trigger, is yet to be determined.
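While the cause is unknown, a random deadlock like this is easier to investigate when it fails fast instead of running into the CI timeout. The following is a minimal POSIX-only sketch of a watchdog around a forked child. It is not the OpenBLAS fork test; `run_in_fork` and its `work` callable are hypothetical names used only for illustration.

```python
import os
import signal
import time


def run_in_fork(work, timeout_s):
    """Run work() in a forked child; return 'ok', 'failed', or 'timeout'."""
    pid = os.fork()
    if pid == 0:
        # Child: run the workload, then exit immediately without
        # running any cleanup handlers inherited from the parent.
        try:
            work()
        except BaseException:
            os._exit(1)
        os._exit(0)

    # Parent: poll for the child until the deadline expires.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return "ok" if status == 0 else "failed"
        time.sleep(0.01)

    # The child is presumed deadlocked: kill it and reap it.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return "timeout"
```

A call such as `run_in_fork(lambda: time.sleep(60), 0.2)` returns `"timeout"` after roughly 0.2 seconds, so a hanging child shows up as an explicit result rather than a stalled job.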

It seems that aarch64 runners are much faster than x86_64 ones (for this workload), with QEMU builds going down from 1 hour to 40 minutes.

@mattip
Collaborator

mattip commented Nov 23, 2025

I thought the fork tests hang would disappear

One of the ppc64le runs succeeds, the other fails. The failed run prints

2025-11-23T08:09:19.4979912Z TEST 122/127 zgemv:2_0_nan_1_inf_1_incy_2 [OK]
2025-11-23T08:09:19.5031453Z TEST 123/127 potrf:bug_695 [OK]
2025-11-23T08:09:19.5115065Z TEST 124/127 potrf:smoketest_trivial [OK]
2025-11-23T08:09:19.7959236Z TEST 125/127 kernel_regress:skx_avx [OK]
2025-11-23T08:35:46.7868483Z ##[error]The action has timed out.

The successful run prints

2025-11-23T07:58:44.4121838Z TEST 122/127 zgemv:2_0_nan_1_inf_1_incy_2 [OK]
2025-11-23T07:58:44.4172979Z TEST 123/127 potrf:bug_695 [OK]
2025-11-23T07:58:44.4255760Z TEST 124/127 potrf:smoketest_trivial [OK]
2025-11-23T07:58:44.7111789Z TEST 125/127 kernel_regress:skx_avx [OK]
2025-11-23T07:59:10.0551932Z TEST 126/127 fork:safety [OK]
2025-11-23T07:59:10.0740217Z TEST 127/127 fork:safety_after_fork_in_parent [OK]

which suggests the problem is in fork:safety.
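The comparison of the two logs can be sketched as a small script: find the last test that reported [OK] in the timed-out run, then look up the next test number in the successful run. The log excerpts are the lines quoted above; the function names are hypothetical and the regular expression assumes the `TEST n/total name [OK]` shape seen in those lines.

```python
import re

# Matches lines such as "TEST 125/127 kernel_regress:skx_avx [OK]".
TEST_LINE = re.compile(r"TEST (\d+)/(\d+) (\S+) \[OK\]")

FAILED_LOG = """\
2025-11-23T08:09:19.7959236Z TEST 125/127 kernel_regress:skx_avx [OK]
2025-11-23T08:35:46.7868483Z ##[error]The action has timed out.
"""

PASSING_LOG = """\
2025-11-23T07:58:44.7111789Z TEST 125/127 kernel_regress:skx_avx [OK]
2025-11-23T07:59:10.0551932Z TEST 126/127 fork:safety [OK]
2025-11-23T07:59:10.0740217Z TEST 127/127 fork:safety_after_fork_in_parent [OK]
"""


def completed_tests(log_text):
    """Map test number -> test name for every line reporting [OK]."""
    return {int(m.group(1)): m.group(3) for m in TEST_LINE.finditer(log_text)}


def first_missing_test(failed_log, passing_log):
    """Return (number, name) of the test that follows the last [OK] in failed_log."""
    failed = completed_tests(failed_log)
    passing = completed_tests(passing_log)
    next_number = max(failed) + 1
    # The name is None if the passing log does not contain that test number.
    return next_number, passing.get(next_number)


suspect = first_missing_test(FAILED_LOG, PASSING_LOG)
```

With these excerpts, `suspect` identifies test 126, `fork:safety`, as the first test that never reported a result in the timed-out run.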

The test itself is the one from the SciPy issue, which is also the test in #229. I will try to debug it in a QEMU Docker container.

@mattip
Collaborator

mattip commented Nov 23, 2025

Another problem: It seems this compiled shared object from the wheels-macos-latest-arm64-1-macosx- artifact suffers from the same segfault from issue #233 when testing the zladiv interface. Did something change in the way gfortran exports functions?

@mayeut
Contributor Author

mayeut commented Nov 23, 2025

It seems this compiled shared object from the wheels-macos-latest-arm64-1-macosx- artifact suffers from the same segfault from issue #233 when testing the zladiv interface. Did something change in the way gfortran exports functions?

This PR does not touch the macOS build except for the OpenBLAS update, which only has a limited diff compared to what's in main. The only Fortran-related change is OpenMathLib/OpenBLAS#5540, which seems right. Does main pass (or the current nightly build, which uses the latest develop)?

As a side note, this PR still uses gfortran on Linux.

2 participants