Add opus C++ template library documentation and unit tests#2004
5ae2b1f to c39df07
Pull request overview
Adds a comprehensive Markdown guide documenting the opus single-header C++ template library (used by AITER HIP kernels) with sections covering types, compile-time constants, containers, layouts, gmem/smem access, MFMA/tiled MMA, distributed tensor views, and utilities.
Changes:
- Introduces a new `docs/opus_guide.md` documentation guide for `opus`.
- Adds usage examples and API references for gmem/smem, MFMA, tiled MMA, and partition/layout utilities.
- Documents how Opus is used in several AITER kernels and helper utilities.
docs/opus_guide.md
Outdated
// Compile-time constants
auto n = 42_I;            // number<42>
auto s = seq<2, 4, 8>{};  // Compile-time integer sequence
In the Quick Reference snippet, auto s is first used for seq<2, 4, 8>{} but later reused for make_smem(...) in the same code block, which makes the snippet fail to compile as-written. Rename one of the variables or split these examples into separate scopes/snippets so readers can copy/paste reliably.
- auto s = seq<2, 4, 8>{}; // Compile-time integer sequence
+ auto seq_vals = seq<2, 4, 8>{}; // Compile-time integer sequence
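For readers wondering how a literal such as `42_I` can carry its value in the type, the following is a minimal sketch of the general technique (a numeric literal operator template). The `sketch` namespace, `number` stand-in, and `parse_decimal` helper are assumptions made for illustration; this is not claimed to be how `opus.hpp` defines them.

```cpp
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for a compile-time integer wrapper; not the opus definition.
template <int V>
struct number {
    static constexpr int value = V;
};

// Fold the digit characters of the literal into an int at compile time.
template <char... Cs>
constexpr int parse_decimal() {
    static_assert(((Cs >= '0' && Cs <= '9') && ...), "decimal digits only");
    int v = 0;
    ((v = v * 10 + (Cs - '0')), ...);
    return v;
}

// Numeric literal operator template: the digits arrive as a char pack,
// so the result type can depend on the literal's value.
template <char... Cs>
constexpr number<parse_decimal<Cs...>()> operator""_I() {
    return {};
}

}  // namespace sketch

using namespace sketch;

// The value lives in the type, so it is usable in constant expressions.
static_assert(std::is_same_v<decltype(42_I), number<42>>, "type carries the value");
static_assert(decltype(7_I)::value == 7, "value is readable at compile time");
```

Because the result is a distinct type per value, such constants can be passed as ordinary function arguments while still selecting template specializations.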
docs/opus_guide.md
Outdated
auto s = make_smem(smem);
auto val = s.load<4>(offset);
In the Quick Reference snippet, val is declared earlier as fp16_t val; and then redeclared here as auto val = s.load<4>(offset);, which will not compile in a single scope. Use distinct names (e.g., val0 / smem_val) or separate the examples into different snippets/scopes.
- auto s = make_smem(smem);
- auto val = s.load<4>(offset);
+ auto smem_view = make_smem(smem);
+ auto smem_val = smem_view.load<4>(offset);
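As an alternative to renaming, the other option mentioned in these comments (separate scopes) could look roughly like the sketch below. `seq_stub` and `smem_stub` are hypothetical stand-ins so the snippet compiles without `opus.hpp`; they are not the opus types, and only the scoping pattern is the point.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for a compile-time integer sequence.
template <int... Is>
struct seq_stub {
    static constexpr std::size_t size() { return sizeof...(Is); }
};

// Hypothetical stand-in for a shared-memory view over a float buffer.
struct smem_stub {
    const float* base;

    // Load N consecutive elements starting at `offset`.
    template <std::size_t N>
    std::array<float, N> load(std::size_t offset) const {
        std::array<float, N> out{};
        for (std::size_t i = 0; i < N; ++i) out[i] = base[offset + i];
        return out;
    }
};

inline float quick_reference_demo() {
    float result = 0.0f;
    {   // Example 1: `s` names a compile-time sequence in this scope only.
        auto s = seq_stub<2, 4, 8>{};
        result += static_cast<float>(s.size());
    }
    {   // Example 2: `s` and `val` are fresh names in a separate scope.
        const float buffer[8] = {0, 1, 2, 3, 4, 5, 6, 7};
        auto s = smem_stub{buffer};
        auto val = s.load<4>(2);  // elements 2, 3, 4, 5
        result += val[0] + val[3];
    }
    return result;
}

}  // namespace sketch
```

Block scopes keep the short names used in the guide while still letting the whole snippet be pasted into one function body.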
docs/opus_guide.md
Outdated
| Value | Constant | Meaning |
|-------|----------|---------|
| 0 | `RT` | Default (return temporal) |
| 3 | `GROUP_NT` | Group non-temporal — hints that data won't be reused |
The aux section references constants RT and GROUP_NT, but these identifiers are not defined in opus/opus.hpp (they appear to be defined in csrc/include/aiter_opus_plus.h under namespace aiter). Consider either removing the constant-name column, qualifying them as aiter::RT / aiter::GROUP_NT, or documenting aux purely as a numeric immediate passed to the underlying buffer intrinsic.
- | Value | Constant | Meaning |
- |-------|----------|---------|
- | 0 | `RT` | Default (return temporal) |
- | 3 | `GROUP_NT` | Group non-temporal — hints that data won't be reused |
+ | Value | Meaning |
+ |-------|---------|
+ | 0 | Default (temporal) load/store behavior |
+ | 3 | Group non-temporal — hints that data won't be reused |
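To illustrate the "numeric immediate" framing, here is a small host-only sketch of a load wrapper that takes `aux` as a non-type template parameter. The `sketch` namespace, `load_stub`, and `load_result` are hypothetical; the constant names and values are copied from the table under review and redefined locally, so nothing here depends on `opus.hpp` or `aiter_opus_plus.h`.

```cpp
namespace sketch {

// Values from the table above, redefined here only for illustration.
inline constexpr int RT = 0;        // default (temporal) behavior
inline constexpr int GROUP_NT = 3;  // group non-temporal hint

// Result of the stand-in load: the value read plus the aux immediate used.
struct load_result {
    int value;
    int aux_used;
};

// Hypothetical load wrapper. `aux` is a compile-time immediate, so a real
// implementation could forward it to an intrinsic that requires a constant.
template <int aux = RT>
constexpr load_result load_stub(const int* ptr, int offset) {
    static_assert(aux >= 0, "aux is expected to be a non-negative immediate");
    return {ptr[offset], aux};
}

}  // namespace sketch
```

A call site can then pass either the bare number or a named constant, for example `sketch::load_stub<3>(p, i)` or `sketch::load_stub<sketch::GROUP_NT>(p, i)`, which is why documenting the numeric value alone is sufficient.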
docs/opus_guide.md
Outdated
| `i16_t` / `u16_t` | `int16_t` / `uint16_t` | 16-bit integers |
| `i8_t` / `u8_t` | `int8_t` / `uint8_t` | 8-bit integers |
The scalar-type table implies u16_t is always available, but in opus.hpp it is only registered when __clang_major__ >= 20 (ROCm 7+). Please note this conditional availability (or avoid listing u16_t as a guaranteed type) so users on older toolchains don’t get compile errors.
- | `i16_t` / `u16_t` | `int16_t` / `uint16_t` | 16-bit integers |
- | `i8_t` / `u8_t` | `int8_t` / `uint8_t` | 8-bit integers |
+ | `i16_t` | `int16_t` | 16-bit signed integer |
+ | `u16_t`* | `uint16_t` | 16-bit unsigned integer (only when `__clang_major__ >= 20`, e.g. ROCm 7+) |
+ | `i8_t` / `u8_t` | `int8_t` / `uint8_t` | 8-bit integers |
+ \* `u16_t` is conditionally registered in `opus.hpp` and is only available on toolchains with `__clang_major__ >= 20` (ROCm 7+). On older ROCm/Clang versions, `u16_t` is not provided.
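Code that has to build on both older and newer toolchains could guard on the same condition. The sketch below is an assumption about how user code might do that: `SKETCH_HAS_OPUS_U16` and `u16_portable_t` are hypothetical names, and `opus.hpp` is not known to define any such feature macro.

```cpp
#include <cstdint>

// Hypothetical feature macro mirroring the condition quoted in the review.
#if defined(__clang_major__) && (__clang_major__ >= 20)
#define SKETCH_HAS_OPUS_U16 1
#else
#define SKETCH_HAS_OPUS_U16 0
#endif

namespace sketch {

// Alias that is always available, for code that must not depend on
// a conditionally registered 16-bit unsigned type.
using u16_portable_t = std::uint16_t;

// Reports whether the toolchain meets the quoted version condition.
constexpr bool has_opus_u16() { return SKETCH_HAS_OPUS_U16 != 0; }

}  // namespace sketch

static_assert(sizeof(sketch::u16_portable_t) == 2, "16-bit storage expected");
```

Call sites can branch with `if constexpr (sketch::has_opus_u16())` or `#if SKETCH_HAS_OPUS_U16` so that older toolchains never see the conditional type.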
c39df07 to b0f5fbc
#2017 => refactored here :)
- Documentation guide for the opus micro STD library
- 16 standalone test groups compiled with hipcc (no GTest dependency)
- Test coverage: number, seq, array, tuple, vector, slice, layout, static_for, type_traits, underscore, embed, packed_types, adaptor, mfma_types, warp_size, functional
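A test group without a GTest dependency can be as small as a check macro plus a failure counter. The following host-only sketch shows one possible shape; `SKETCH_CHECK`, `g_failures`, `med3_ref`, and `run_med3_group` are hypothetical names, and `med3_ref` is an assumed median-of-three reference rather than the opus function.

```cpp
#include <cstdio>

namespace sketch {
// Running failure count shared by all checks in this translation unit.
inline int g_failures = 0;
}  // namespace sketch

// Record a failure with its location instead of aborting,
// so a single run reports every broken check.
#define SKETCH_CHECK(cond)                                              \
    do {                                                                \
        if (!(cond)) {                                                  \
            std::printf("FAIL %s:%d: %s\n", __FILE__, __LINE__, #cond); \
            ++sketch::g_failures;                                       \
        }                                                               \
    } while (0)

namespace sketch {

// Hypothetical host-side reference: median of three values.
constexpr int med3_ref(int a, int b, int c) {
    const int lo = a < b ? a : b;
    const int hi = a < b ? b : a;
    return c < lo ? lo : (c > hi ? hi : c);
}

// One "test group": resets the counter, runs its checks,
// and returns the number of failed checks.
inline int run_med3_group() {
    g_failures = 0;
    SKETCH_CHECK(med3_ref(1, 2, 3) == 2);
    SKETCH_CHECK(med3_ref(3, 1, 2) == 2);
    SKETCH_CHECK(med3_ref(5, 5, 1) == 5);
    return g_failures;
}

}  // namespace sketch
```

A `main` that sums the return values of each group and exits non-zero on any failure would be enough for a shell script to detect a broken build.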
GPU tests covering: type conversions (fp16/bf16/fp8), math ops (max/min/med3), DPP warp operations (mov_dpp/upd_dpp), GMEM buffer load/store (scalar + vec), SMEM/LDS load/store (scalar + vec), MFMA intrinsics (16x16x16, 32x32x8, accumulator chaining, scaled values), mfma_adaptor device-side shapes/layouts, swap_ab adaptor, tiled MMA (2x2x1 expansion), FP8 pack/unpack, container folding, and s_waitcnt synchronization.
- Fix duplicate variable names in Quick Reference snippets (s, val)
- Clarify aux template parameter constants (RT/GROUP_NT from aiter_opus_plus.h)
- Note u16_t conditional availability (ROCm 7+ / clang >= 20)
382c53e to 7307973
Summary
csrc/include/opus/opus.hpp)
Test plan
hipcc -std=c++20 -O2)
bash csrc/include/opus/run_test.sh