Commit a8180ee

[test] qwen3 moe w4a16 + skip
Summary: This test would ordinarily take too long, so we only quantize the first 10 layers.

Signed-off-by: HDCharles <charlesdavidhernandez@gmail.com>
1 parent db0b68d commit a8180ee

File tree

2 files changed: 29 additions, 0 deletions

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+cadence: "nightly"
+test_type: "regression"
+model: Qwen/Qwen3-30B-A3B
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
+scheme: W4A16_group
+num_calibration_samples: 20
+save_dir: "Qwen3-30B-A3B-W4A16-first-10"
+recipe: tests/e2e/vLLM/recipes/WNA16/recipe_w4a16_group_quant_first_10_layers.yaml
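The W4A16_group scheme named in this config quantizes weights to 4-bit integers while keeping activations in 16-bit floats, with one floating-point scale per group of weights (the recipe below sets group_size: 128). A minimal sketch of symmetric per-group weight quantization, assuming a zero-point of 0 and an absmax-derived scale (the `quantize_group` helper is hypothetical, not llm-compressor's implementation):

```python
def quantize_group(weights, num_bits=4):
    """Toy symmetric per-group quantization: one fp scale per group, zero-point 0.

    Hypothetical sketch of the W4A16 weight path; real kernels operate on
    tensors grouped along the input dimension, not Python lists.
    """
    qmax = 2 ** (num_bits - 1) - 1                 # 7 for int4
    scale = max(abs(w) for w in weights) / qmax or 1.0   # avoid scale == 0
    # Round to the nearest int4 level, clamping to the signed range [-8, 7].
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    dequant = [qi * scale for qi in q]             # what the matmul effectively sees
    return q, scale, dequant

q, scale, deq = quantize_group([0.1, -0.5, 0.7])   # q == [1, -5, 7]
```

Smaller groups track local weight magnitudes more closely (lower quantization error) at the cost of storing more scales; 128 is a common trade-off for group schemes.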
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+quant_stage:
+  quant_modifiers:
+    GPTQModifier:
+      ignore: [
+        "lm_head",
+        # Ignore layers (10+)
+        "re:.*model\\.layers\\.([1-9][0-9])\\..*",
+      ]
+      actorder: null
+      config_groups:
+        group_0:
+          weights:
+            num_bits: 4
+            type: "int"
+            symmetric: True
+            strategy: "group"
+            group_size: 128
+            input_activations: null
+            output_activations: null
+          targets: ["Linear"]
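The only non-obvious line in the recipe is the ignore regex: `[1-9][0-9]` matches exactly two-digit layer indices (10-99), so layers 0-9 are quantized and everything from layer 10 up is skipped, which is enough for this model's depth. A quick check with standard Python `re` semantics (the module names below are illustrative, not taken from the model):

```python
import re

# Same pattern as the recipe, minus the "re:" prefix that marks it as a
# regex (rather than exact-name) ignore entry.
pattern = re.compile(r".*model\.layers\.([1-9][0-9])\..*")

names = [
    "model.layers.0.mlp.experts.0.gate_proj",   # quantized
    "model.layers.9.self_attn.q_proj",          # quantized (single digit)
    "model.layers.10.self_attn.q_proj",         # ignored
    "model.layers.47.mlp.experts.7.down_proj",  # ignored
]
ignored = [n for n in names if pattern.match(n)]
# ignored == names[2:]
```

Note that the trailing `\.` after the capture group is what prevents a single-digit layer like 9 from matching: `[1-9][0-9]` would need to consume the digit plus the following dot, which fails.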
