Commit 954a2cf
Add support of 64 headDim (#5114)
Summary:
Pull Request resolved: #5114
X-link: https://github.com/facebookresearch/FBGEMM/pull/2120
This diff adds support for a head dimension of 64 in the Blackwell Decode attention algorithm. The code changes include a dispatch macro for the head dimension and a test case for the new head dimension. The test case is skipped for FP8 with head_dim=64 in GQA mode because of known numerical precision issues.
Reviewed By: jaewonlee-fb, jianyuh
Differential Revision: D86774487
fbshipit-source-id: 6583ee3d2f337702c01fc32fd3b9ddfd0b02c29b
1 parent 03cdbd8 commit 954a2cf
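The dispatch macro mentioned in the summary is not visible on this page, so the following is only a minimal sketch of how a runtime head dimension is commonly mapped to a compile-time constant. The macro name `DISPATCH_HEAD_DIM_SKETCH`, the helper functions, and the supported set {64, 128} are all assumptions made for illustration; they are not taken from the FBGEMM sources.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical dispatch macro. The name and the supported set {64, 128}
// are assumptions for illustration, not the macro added by this commit.
// It binds CONST_NAME to a compile-time constant chosen from the runtime
// value HEAD_DIM, then invokes the callable passed as the last argument.
#define DISPATCH_HEAD_DIM_SKETCH(HEAD_DIM, CONST_NAME, ...)        \
  [&] {                                                            \
    if ((HEAD_DIM) == 64) {                                        \
      constexpr static int CONST_NAME = 64;                        \
      return __VA_ARGS__();                                        \
    } else if ((HEAD_DIM) == 128) {                                \
      constexpr static int CONST_NAME = 128;                       \
      return __VA_ARGS__();                                        \
    }                                                              \
    throw std::invalid_argument("unsupported head_dim: " +         \
                                std::to_string(HEAD_DIM));         \
  }()

// Stand-in for a kernel launcher templated on the head dimension.
// It only returns its template argument so the dispatch is observable.
template <int HeadDim>
int decode_kernel_sketch() {
  static_assert(HeadDim % 64 == 0, "sketch assumes multiples of 64");
  return HeadDim;
}

// Selects the template instantiation that matches the runtime head_dim.
int run_decode_sketch(int head_dim) {
  return DISPATCH_HEAD_DIM_SKETCH(head_dim, kHeadDim, [&] {
    return decode_kernel_sketch<kHeadDim>();
  });
}
```

In this sketch, `run_decode_sketch(64)` selects the `decode_kernel_sketch<64>` instantiation, and an unsupported value such as 96 throws `std::invalid_argument` instead of silently falling through.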
File tree
2 files changed, +41 -4 lines changed
- fbgemm_gpu/experimental/gen_ai
  - src/attention/cuda/cutlass_blackwell_fmha
  - test/attention

Lines changed: 36 additions & 3 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 290-292 | 290-292 | unchanged |
| | 293-323 | + 31 lines added |
| 293-295 | 324-326 | unchanged |
| … | … | |
| 300-302 | 331-333 | unchanged |
| | 334 | + 1 line added |
| 303-305 | 335-337 | unchanged |
| 306-308 | | - 3 lines removed |
| | 338-341 | + 4 lines added |
| 309-311 | 342-344 | unchanged |
Lines changed: 5 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 693-695 | 693-695 | unchanged |
| 696 | | - 1 line removed |
| | 696 | + 1 line added |
| 697-699 | 697-699 | unchanged |
| … | … | |
| 720-722 | 720-722 | unchanged |
| | 723-726 | + 4 lines added |
| 723-725 | 727-729 | unchanged |