Commit d18b3ed

Added tuned kernel block size for customer's model (#1071)
Signed-off-by: cychiuak <andersonchiu@google.com>
1 parent b2c7446 commit d18b3ed

1 file changed, +7 -0 lines changed
tpu_inference/kernels/ragged_paged_attention/v3/tuned_block_sizes.py

Lines changed: 7 additions & 0 deletions
@@ -1231,6 +1231,13 @@
                 },
             }
         },
+        16: {
+            'q_bfloat16_kv_bfloat16': {
+                'q_head-8_kv_head-1_head-128': {
+                    262144: (128, 256),
+                }
+            }
+        },
     },
     'TPU v5e': {
         128: {
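
For readers unfamiliar with the table layout, here is a minimal sketch of how a nested entry like the one added above could be looked up. It is an illustration only, not the repository's lookup code: the helper `lookup_block_sizes`, the table name `EXAMPLE_TABLE`, and the device key `'example-device'` are hypothetical, since the diff does not show which device section the new entry belongs to. The diff also does not say what the integer keys or the `(128, 256)` tuple mean, so the sketch treats them as opaque values.

```python
# Hypothetical excerpt that mirrors the nesting of the entry added in this
# commit. 'example-device' is a placeholder; the real device key is not
# visible in the diff.
EXAMPLE_TABLE = {
    'example-device': {
        16: {
            'q_bfloat16_kv_bfloat16': {
                'q_head-8_kv_head-1_head-128': {
                    262144: (128, 256),
                },
            },
        },
    },
}


def lookup_block_sizes(table, keys, default=None):
    """Walk the nested dict along `keys`; return `default` if any key is missing."""
    node = table
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Path to the newly added entry, outermost key first.
tuned = lookup_block_sizes(
    EXAMPLE_TABLE,
    ('example-device', 16, 'q_bfloat16_kv_bfloat16',
     'q_head-8_kv_head-1_head-128', 262144),
)

# A configuration with no tuned entry falls back to the default.
untuned = lookup_block_sizes(
    EXAMPLE_TABLE,
    ('example-device', 32, 'q_bfloat16_kv_bfloat16',
     'q_head-8_kv_head-1_head-128', 262144),
)
```

Returning a default for missing paths rather than raising keeps untuned configurations usable; whether the actual kernel falls back in this way is not shown in this commit.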