Commit 15b29be
authored
[Transform] Spinquant R3 (#1778)
## Purpose ##
* Support R3 for spinquant (without attention quantization, but possibly
with kv_cache quantization)
## Prerequisites ##
* #1651
* #1918
* vllm-project/compressed-tensors#485
## Changes ##
* Break out head dim inference
* Add attention target for spinquant mappings
## Testing ##
* Tested model coherence with llama 8b model
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>1 parent 71cd6ae commit 15b29be
File tree
2 files changed
+25
-24
lines changed- src/llmcompressor/modifiers/transform/spinquant
2 files changed
+25
-24
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
12 | | - | |
| 12 | + | |
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
| |||
129 | 129 | | |
130 | 130 | | |
131 | 131 | | |
| 132 | + | |
132 | 133 | | |
133 | 134 | | |
134 | 135 | | |
135 | 136 | | |
136 | 137 | | |
137 | 138 | | |
138 | | - | |
| 139 | + | |
139 | 140 | | |
140 | 141 | | |
141 | | - | |
| 142 | + | |
142 | 143 | | |
143 | 144 | | |
144 | 145 | | |
| |||
223 | 224 | | |
224 | 225 | | |
225 | 226 | | |
226 | | - | |
227 | | - | |
228 | | - | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | | - | |
233 | | - | |
234 | | - | |
235 | | - | |
236 | | - | |
237 | | - | |
238 | | - | |
239 | | - | |
240 | | - | |
241 | | - | |
242 | | - | |
243 | | - | |
| 227 | + | |
244 | 228 | | |
245 | 229 | | |
246 | 230 | | |
| |||
257 | 241 | | |
258 | 242 | | |
259 | 243 | | |
260 | | - | |
261 | | - | |
262 | | - | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
263 | 261 | | |
264 | 262 | | |
265 | 263 | | |
| |||
Lines changed: 3 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
| |||
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| 33 | + | |
32 | 34 | | |
33 | 35 | | |
34 | 36 | | |
| |||
50 | 52 | | |
51 | 53 | | |
52 | 54 | | |
| 55 | + | |
53 | 56 | | |
54 | 57 | | |
55 | 58 | | |
| |||
0 commit comments