ENH: Bring CI workflow current — ITK 5.4.6, macos-15, OpenCL ICD 2025.07.22#74

Open
hjmjohnson wants to merge 7 commits into enh/vkfft-backend-5-metal from ci/itk-5.4.6

hjmjohnson (Member) commented May 20, 2026

Bring .github/workflows/build-test-package.yml up to date. This PR targets the PR #73 branch as its base, so the workflow update lands on the same series as the v5.4.6 backend work.

The Windows and macOS Python wheel jobs are currently disabled with if: false. They are blocked on upstream InsightSoftwareConsortium/ITKPythonBuilds#3: both ITKPythonBuilds-windows.zip and ITKPythonBuilds-macosx-arm64.tar.zst bake CMAKE_CXX_STANDARD=14 into the castxml wrapping config, while the ITK 6 source uses C++17 CTAD (class template argument deduction), which castxml cannot parse in C++14 mode. Re-enable the jobs here once a new ITKPythonBuilds release ships with CMAKE_CXX_STANDARD=17.

Pinned version bumps
| Variable | Before | After |
| --- | --- | --- |
| itk-git-tag | v5.3.0 | v5.4.6 |
| itk-wheel-tag | v5.3.0 | v5.4.6 |
| opencl-icd-loader-git-tag | v2021.04.29 | v2025.07.22 |
| opencl-headers-git-tag | v2021.04.29 | v2025.07.22 |
| macOS runner | macos-14 | macos-15 |
| actions/download-artifact | v2 | v4 (required for upload-artifact@v4 interop) |
| lukka/get-cmake | v3.22.2 | @latest |

itk-python-package-tag is left at its existing pinned commit; bumping it requires verifying the wheel-build scripts against ITK 5.4.6 first and is out of scope here.

Python wheel matrix narrowed
| Job | Before | After |
| --- | --- | --- |
| build-linux-opencl-python-packages | ["37","38","39","310","311"] | ["310","311"] |
| build-windows-opencl-python-packages | ["9","10","11"] | ["10","11"] |

Drops Python 3.7 / 3.8 / 3.9 wheel production, in line with pyproject.toml requires-python = ">=3.10". Python 3.9 remains the interpreter that the C++ build job uses to pip-install ninja; this is incidental and is not exercised by the wheels.

Cross-platform OpenCL availability

The previous Install pocl step ran sudo conda install -c conda-forge pocl. GitHub-hosted Ubuntu and macOS images no longer ship Conda by default, so the step failed on macos-14/macos-15 and on recent ubuntu images. Replace it with native package managers:

  • Linux (ubuntu-24.04): sudo apt-get install -y pocl-opencl-icd ocl-icd-opencl-dev clinfo
  • macOS (macos-15): brew install pocl
  • Windows (windows-2022): unchanged; builds the ICD loader from source, no CPU OpenCL needed.

Stacking

This PR is built on top of PR #73 (Level Zero + Metal backends, CUDA 13 fix, multi-ICD OpenCL fix, LZ runtime probe). The branch is ci/itk-5.4.6 on hjmjohnson; its base is enh/vkfft-backend-5-metal (PR #73's head). Merge PR #73 first; this PR will then rebase cleanly onto upstream main.

hjmjohnson force-pushed the ci/itk-5.4.6 branch 16 times, most recently from 71074f0 to 1804179, on May 21, 2026 16:17

…_BACKEND

Default VKFFT_BACKEND to OpenCL in itkVkDefinitions.h when it is not
defined on the command line, so castxml wrapping invocations do not take
vkFFT.h's Vulkan branch and fail on a missing <vulkan/vulkan.h>.
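
The fallback described here amounts to a preprocessor guard. The following is a minimal sketch, not the actual contents of itkVkDefinitions.h (which this page does not show). It assumes that backend number 3 denotes OpenCL, as in the VKFFT_BACKEND=3 leg mentioned elsewhere in this PR; ActiveBackend is a hypothetical helper that only makes the selected value observable.

```cpp
// Sketch of a default-backend guard; not the actual contents of itkVkDefinitions.h.
// Assumption: 3 is the OpenCL backend number ("VKFFT_BACKEND=3 (OpenCL)" in this PR).
#ifndef VKFFT_BACKEND
#  define VKFFT_BACKEND 3
#endif

// Hypothetical helper that exposes the value chosen at compile time.
constexpr int
ActiveBackend()
{
  return VKFFT_BACKEND;
}
```

A definition passed on the compiler command line still takes precedence, because the guard only fires when the macro is undefined. That is why a tool such as castxml, which is invoked without the definition, would end up on the OpenCL path in this sketch.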

Skip OpenCL platforms that report no devices: querying device count with
num_entries=0 on such a platform (e.g. Apple's deprecated OpenCL
framework on macOS 15) returns CL_DEVICE_NOT_FOUND, which must be
treated as 'try the next platform' rather than a hard failure.
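
The platform-skipping rule can be sketched as follows. To keep the example self-contained, the OpenCL calls are replaced by recorded replies: PlatformReply and FirstUsablePlatform are hypothetical names, and the two status constants are stand-ins whose values match CL_SUCCESS and CL_DEVICE_NOT_FOUND in CL/cl.h. In real code the reply would presumably come from a clGetDeviceIDs call with num_entries=0.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-ins for OpenCL status codes so the sketch compiles without OpenCL headers.
// The numeric values match CL_SUCCESS and CL_DEVICE_NOT_FOUND in CL/cl.h.
constexpr int kSuccess = 0;
constexpr int kDeviceNotFound = -1;

// Hypothetical record of what a device-count query returned for one platform.
struct PlatformReply
{
  int          status;      // status code returned by the query
  unsigned int deviceCount; // number of devices the platform reported
};

// Returns the index of the first platform with at least one device,
// or -1 if no platform has any. Throws on any other query error.
int
FirstUsablePlatform(const std::vector<PlatformReply> & replies)
{
  for (std::size_t i = 0; i < replies.size(); ++i)
  {
    const PlatformReply & reply = replies[i];
    if (reply.status == kDeviceNotFound || (reply.status == kSuccess && reply.deviceCount == 0))
    {
      continue; // empty platform: try the next one instead of failing
    }
    if (reply.status != kSuccess)
    {
      throw std::runtime_error("OpenCL device query failed");
    }
    return static_cast<int>(i);
  }
  return -1;
}
```

The design point is that only the "no devices" status is downgraded to a skip; every other error still aborts, so real driver failures are not hidden.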

itk-module-init.cmake searched only the default system paths for ze_loader
and the Level Zero headers, missing a loader installed to a non-system prefix.
Honor LEVEL_ZERO_ROOT/CMPLR_ROOT with include and lib path suffixes,
matching the top-level CMakeLists.txt.

Add the level_zero/ subdirectory to the include path: VkFFT includes
<ze_api.h> bare while this module uses <level_zero/ze_api.h>, so both
the prefix and its level_zero/ subdir must be reachable.

Update the build/test/package workflow to ITK v5.4.6, the macos-15
runner, OpenCL ICD loader/headers v2025.07.22, Python 3.9, and current
actions/* versions. Build only the ITK modules VkFFTBackend depends on
(ITK_BUILD_DEFAULT_MODULES=OFF plus the declared DEPENDS/COMPILE_DEPENDS/
TEST_DEPENDS) via a shared itk-minimal-modules variable. Point the
freshly built ICD loader at conda's pocl vendor file and restrict the
hosted ctest run to the non-FFT smoke tests (pocl on CPU diverges from
real-GPU VkFFT kernels). Disable the Python wheel jobs pending an
ITKPythonBuilds C++17 wrapping fix. Resolve the dockcross OpenCL loader
library by glob so the wheel script tracks the ICD loader version.

Test GPU and Notebook tests request [self-hosted, gpu] and
[self-hosted, notebook-gpu] runners that are not online for this repo,
so they queued indefinitely on every push/PR. Trigger them only on
workflow_dispatch or when a PR carries the 'gpu-ci' label.

…ackends

The hosted build-cxx leg compiles only VKFFT_BACKEND=3 (OpenCL). Add a
build-backend job that configures and compiles each remaining
module-supported backend (no GPU needed to compile):

  - CUDA (1)       ubuntu-24.04, CUDA toolkit + libcufft + driver stub
  - Level Zero (4) ubuntu-24.04, ze_loader built from source, installed
                   to /usr so VkFFT's hardcoded header/library search and
                   bare ze_loader link resolve
  - Metal (5)      macos-15, builds with BUILD_TESTING=ON and runs ctest;
                   the Apple Silicon runner executes the FFTs on its GPU

ITK is built with only the module's dependency set. HIP (2) is omitted:
the module has no VKFFT_BACKEND=2 CMake branch.
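
For orientation, the backend numbers used in this commit message can be written out as a small lookup. This is an illustrative sketch: BackendName and IsCompiledInCI are hypothetical helpers that do not come from the module. The numbering follows VkFFT's VKFFT_BACKEND convention, and the CI set is read directly from the text above.

```cpp
#include <string>

// Backend numbers as used by VkFFT's VKFFT_BACKEND macro.
// Hypothetical helper; the module itself may not contain such a function.
inline std::string
BackendName(int backend)
{
  switch (backend)
  {
    case 0:
      return "Vulkan";
    case 1:
      return "CUDA";
    case 2:
      return "HIP";
    case 3:
      return "OpenCL";
    case 4:
      return "Level Zero";
    case 5:
      return "Metal";
    default:
      return "unknown";
  }
}

// Backends that CI compiles after this change, per the commit message:
// OpenCL (3) in build-cxx; CUDA (1), Level Zero (4) and Metal (5) in build-backend.
inline bool
IsCompiledInCI(int backend)
{
  return backend == 1 || backend == 3 || backend == 4 || backend == 5;
}
```

For example, IsCompiledInCI(2) is false, reflecting that HIP has no CMake branch in the module.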

Make CL_TARGET_OPENCL_VERSION a cache variable defaulting to 300 and use
it in the compile definition, so the value the CI workflows pass is
honored instead of a hardcoded 120. Bump the build-test-package and
test-gpu workflows to opencl-version 300. Verified that the OpenCL backend
compiles and links with no new deprecation warnings.

pocl (CPU OpenCL) computes VkFFT's size-19 Bluestein inverse incorrectly,
so the hosted OpenCL legs ran only the lint smoke tests. Give the round-trip
tests (ForwardInverse, ForwardInverse1D, HalfHermitian) an optional maxSize
argument (default 20, unchanged for GPU runners) and register PoclSafe
variants capped at size 16 — radix-2/3/5/7 plus Bluestein primes 11/13,
clear of pocl's prime-19 weakness.
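
The optional cap can be sketched as below. ParseMaxSize and SizesToTest are hypothetical helpers, and the assumption that the tests sweep every size from 2 up to maxSize is made only for illustration, since the real test sources are not shown on this page. The default of 20 and the cap of 16 are the values named above.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper: read an optional maxSize from argv[index], falling back
// to 20 (the default named in the commit message) when the argument is absent.
int
ParseMaxSize(int argc, char * argv[], int index, int defaultMaxSize = 20)
{
  if (index >= argc)
  {
    return defaultMaxSize;
  }
  const int parsed = std::atoi(argv[index]);
  return parsed > 0 ? parsed : defaultMaxSize; // ignore non-numeric or non-positive input
}

// Sizes a round-trip test would visit. Assumption: the test sweeps every size
// from minSize to maxSize inclusive; the real sweep is not shown in this PR.
std::vector<int>
SizesToTest(int maxSize, int minSize = 2)
{
  std::vector<int> sizes;
  for (int n = minSize; n <= maxSize; ++n)
  {
    sizes.push_back(n);
  }
  return sizes;
}
```

With the cap at 16 the sweep never reaches size 19, which is the purpose of the PoclSafe variants. Omitting the argument keeps the GPU runners on the unchanged default.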

The ubuntu leg (conda pocl, verified) now selects 'VkFFTBackend|PoclSafe',
adding genuine FFT correctness coverage on pocl. windows (no OpenCL ICD) and
macos (pocl unverified locally) stay lint-only. Full-size and baseline FFT
correctness still run on the GPU runner.

Verified locally: the capped variants pass under pocl 7.1 (conda-forge) and
on a real GPU; the full-size variants still pass on the GPU and continue to
fail under pocl at size 19.
Labels

enhancement New feature or request
