Commit 6efcd65
vulkan: optimize flash attention split_k_reduce (#14554)
* vulkan: allow FA split_k with smaller KV values
* vulkan: spread split_k_reduce work across more threads
k_num can get rather large. Use the whole workgroup to reduce the M/L values.
Launch a thread for each element in the HSV dimension of the output. Helps a
lot for large HSV (like deepseek).

1 parent 699f439 commit 6efcd65
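The reduction described in the commit message can be sketched on the CPU as follows. This is a minimal illustration, not the shader from this commit. It assumes that each of the `k_num` splits produces an unnormalized partial output `O`, a running maximum `M`, and a running sum of exponentials `L`, and that HSV is the head size of V. All function names and the sample data are hypothetical.

```python
import math

def partial_attention(scores, values):
    """One KV split: return (O, M, L), with O left unnormalized (assumption)."""
    m = max(scores)                                  # M: max score in this split
    weights = [math.exp(s - m) for s in scores]
    l = sum(weights)                                 # L: sum of exp(score - M)
    hsv = len(values[0])
    o = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(hsv)]
    return o, m, l

def split_k_reduce(partials):
    """Combine per-split (O, M, L) triples into one output row."""
    # Reduce the M/L values across all splits first.
    m_max = max(m for _, m, _ in partials)
    scales = [math.exp(m - m_max) for _, m, _ in partials]
    l_total = sum(s * l for s, (_, _, l) in zip(scales, partials))
    hsv = len(partials[0][0])
    # Each index d is independent of the others, so a GPU version could
    # assign one thread per element of the HSV dimension.
    return [
        sum(s * o[d] for s, (o, _, _) in zip(scales, partials)) / l_total
        for d in range(hsv)
    ]

def reference_attention(scores, values):
    """Unsplit softmax-weighted sum, used only to check the reduction."""
    o, _, l = partial_attention(scores, values)
    return [x / l for x in o]

# Hypothetical data: 6 KV positions, HSV = 2, split into k_num = 3 chunks.
scores = [0.5, -1.0, 2.0, 0.1, 3.0, -0.7]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5], [0.5, -0.5], [3.0, 1.0]]
k_num = 3
chunk = len(scores) // k_num
partials = [
    partial_attention(scores[i:i + chunk], values[i:i + chunk])
    for i in range(0, len(scores), chunk)
]
combined = split_k_reduce(partials)
expected = reference_attention(scores, values)
```

Here `combined` and `expected` agree up to floating-point rounding. The M/L reduction costs work proportional to `k_num`, which is why sharing it across a workgroup matters when `k_num` is large, while the per-element output loop grows with HSV.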
File tree
2 files changed: +42 -12 lines changed
- ggml/src/ggml-vulkan
- vulkan-shaders

Lines changed: 4 additions & 4 deletions
| Original lines | New lines | Change |
|---|---|---|
| 2706-2712 | 2706-2712 | line 2709 replaced |
| 6252-6264 | 6252-6264 | lines 6255 and 6261 replaced |
| 6392-6398 | 6392-6398 | line 6395 replaced |

Lines changed: 38 additions & 8 deletions
| Original lines | New lines | Change |
|---|---|---|
| 2-10 | 2-10 | lines 5 and 7 replaced |
| 16-21 | 16-23 | 2 lines added (new 19-20) |
| 32-54 | 34-84 | lines 35-36 replaced (new 37-38); 14 lines added (new 42-55); lines 42-44 replaced (new 58-60); 12 lines added (new 64-75); 2 lines added (new 78-79); line 51 replaced (new 81) |