docs(gpu): add ROCm 7.x and RDNA 3.5 / Strix Halo (gfx1151) to GPU acceleration guide#9229
walcz-de wants to merge 2 commits into mudler:master
- Fix typo: "deditated" → "dedicated", "ROCm6" → "ROCm"
- Add ROCm 7.x to requirements (alongside ROCm 6.x)
- Add Ubuntu 24.04 to tested OS list
- Add AMD Strix Halo / gfx1151 section with kernel params, required env vars (HSA_OVERRIDE_GFX_VERSION, ROCBLAS_USE_HIPBLASLT), and Docker Compose example
- Add gfx1151 to the list of compiled GPU targets
- Add ROCm version column to verified devices table
- Add gfx1151 / Radeon 8060S (ROCm 7.11.0) as verified device
…warning
- Add all 4 required env vars (HSA_OVERRIDE_GFX_VERSION, ROCBLAS_USE_HIPBLASLT, HSA_XNACK=1, HSA_ENABLE_SDMA=0) with descriptions in a table
- Fix Docker Compose example to use the ROCm 7.x image tag (-gpu-hipblas-rocm7), not the ROCm 6.x image
- Add explicit warning: GGML_CUDA_ENABLE_UNIFIED_MEMORY must NOT be set (even =0 activates hipMallocManaged, because the check is getenv != nullptr)
- Add --force-recreate note (docker restart does not update the container env)
- Add tested hardware note (Geekom A9 Mega / Ryzen AI MAX+ 395)
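The environment rules in this commit message can be sketched as a small pre-flight check. This is only an illustration, not part of the PR: `check_strix_halo_env` is a hypothetical helper, and the only expected values it enforces are the two stated above (HSA_XNACK=1, HSA_ENABLE_SDMA=0). The other two required variables are checked for presence only, since no values are given here. The forbidden variable is tested for presence rather than value, mirroring the `getenv != nullptr` behaviour described above.

```python
import os
from typing import List, Mapping

# Values stated in the commit message; the other two required variables
# are only checked for presence because no value is given there.
REQUIRED_VALUES = {"HSA_XNACK": "1", "HSA_ENABLE_SDMA": "0"}
REQUIRED_PRESENT = ("HSA_OVERRIDE_GFX_VERSION", "ROCBLAS_USE_HIPBLASLT")
FORBIDDEN = "GGML_CUDA_ENABLE_UNIFIED_MEMORY"


def check_strix_halo_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means none were found.

    Hypothetical helper for illustration only.
    """
    problems = []
    for name in REQUIRED_PRESENT:
        if name not in env:
            problems.append(f"{name} is not set")
    for name, expected in REQUIRED_VALUES.items():
        if env.get(name) != expected:
            problems.append(f"{name} should be {expected}, got {env.get(name)!r}")
    # Presence check, not a value check: a getenv() != nullptr test treats
    # "0" (or even an empty string) as set.
    if FORBIDDEN in env:
        problems.append(f"{FORBIDDEN} must be unset (found {env[FORBIDDEN]!r})")
    return problems


if __name__ == "__main__":
    for problem in check_strix_halo_env(os.environ):
        print("WARN:", problem)
```

Run inside the container, the script prints one warning per problem. Passing a plain dict instead of `os.environ` makes it easy to try out, for example against the values from a Compose file.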
Context for reviewers
This is the documentation companion to #9230.
On the kernel boot parameters
On the environment variables
On the …
Summary
- AMDGPU_TARGETS default (11 rocWMMA-compatible architectures) and when to override it
Background
AMD Ryzen AI MAX+ (Strix Halo) APUs with an integrated Radeon 8060S (gfx1151 / RDNA 3.5) expose up to 96 GB of unified VRAM to the GPU. They require ROCm 7.x (gfx1151 support is not available in ROCm 6.x) and two runtime env vars to work correctly:
The companion build PR (feat(rocm) #9230) explains the AMDGPU_TARGETS default in detail: the default covers the 11 GPU architectures supported by the rocWMMA library, which is required for -DGGML_HIP_ROCWMMA_FATTN=ON (~50% FlashAttention speedup). GPUs outside that list (gfx803, gfx900, gfx906, gfx1012, gfx1030–1032, gfx1103, gfx1152) can still use ROCm 7.x but require a custom build with the full arch list and without the rocWMMA optimisation.
Test plan
🤖 Generated with Claude Code