Commit abadbd0
Add support for 64 headDim
Summary: This diff adds support for a head dimension of 64 to the Blackwell Decode attention algorithm. The changes add a dispatch macro over head dimension and a test case covering the new head dimension; the test is skipped for a known numerical precision issue with FP8 and head_dim=64 in GQA mode.
Differential Revision: D867744871
Parent: 648e57a
File tree (2 files changed: +41, -5)
- fbgemm_gpu/experimental/gen_ai
  - src/attention/cuda/cutlass_blackwell_fmha
  - test/attention
Lines changed: 36 additions & 4 deletions
[diff body not preserved in this export; additions at new lines 293–323, 334, and 338–341; deletions at original lines 306–308 and 313]
Lines changed: 5 additions & 1 deletion
[diff body not preserved in this export; original line 688 replaced; additions at new lines 715–718]
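The test-side change is likewise not visible in this export. As a hedged sketch (the class, method, and parameter names below are hypothetical, not taken from the commit), skipping the known-bad FP8 + head_dim=64 GQA combination in a unittest-style attention test typically looks like:

```python
import unittest


class BlackwellDecodeAttentionTest(unittest.TestCase):
    # Hypothetical sketch: exercises the new head_dim=64 path and skips
    # the FP8 + GQA combination flagged in the commit summary.
    def _run_decode_attention(self, head_dim: int, dtype: str, gqa: bool) -> None:
        # Stand-in for the real kernel invocation and accuracy check;
        # models the combination the commit describes as numerically bad.
        if dtype == "fp8" and head_dim == 64 and gqa:
            raise AssertionError("numerical precision issue")

    def test_head_dim_64(self) -> None:
        for dtype in ("bf16", "fp8"):
            for gqa in (False, True):
                if dtype == "fp8" and gqa:
                    # Known numerical precision issue with FP8 and
                    # head_dim=64 in GQA mode; skip this combination.
                    continue
                self._run_decode_attention(head_dim=64, dtype=dtype, gqa=gqa)
```

Guarding the combination with an early `continue` (or `unittest.skipIf`) keeps the rest of the head_dim=64 coverage active while documenting the known precision gap.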