Commit e977d03

Separate model and feature support matrices by category (#1100)
Signed-off-by: Teresa Chen <boe20211@gmail.com>
1 parent 75fbb1d commit e977d03

36 files changed, +793 -138 lines changed
.buildkite/README.md

Lines changed: 9 additions & 6 deletions
@@ -22,8 +22,9 @@ To support this requirement, each model and feature will go through a series of
 # Adding a new model to CI
 ## Adding a TPU-optimized model
 TPU-optimized models are models we rewrite the model definition as opposed to using the model definition from the vLLM upstream. These models will go through benchmark on top of unit and integration (accuracy) tests. To add a TPU-optimized model to CI, model owners can use the prepared [add_model_to_ci.py](pipeline_generation/add_model_to_ci.py) script. The script will populate a buildkite yaml config file in the `.buildkite/models` directory; config files under this directory will be integrated to our pipeline automatically. The python script takes 2 arguments:
-- **model_name**: this is the **full name** of your model on Hugging Face. Please ensure to use the **full name** (ex: `meta-llama/Llama-3.1-8B` instead of `Llama-3.1-8B`) or else we won't be able to find your model.
-- **queue**: this is the queue you want to run on (ex: `tpu_v6e_queue`)
+- **--model-name**: this is the **full name** of your model on Hugging Face. Please ensure to use the **full name** (ex: `meta-llama/Llama-3.1-8B` instead of `Llama-3.1-8B`) or else we won't be able to find your model.
+- **--queue**: this is the queue you want to run on (ex: `tpu_v6e_queue`)
+- **--category**: this parameter allows you to set the model category, with the following options available: "text-only" or "multimodel".
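The three flags listed in this hunk could be parsed along the following lines. This is a minimal sketch, not the actual `add_model_to_ci.py`, whose source does not appear in this part of the diff. The function names `build_parser` and `parse_model_args` are hypothetical, `--category` is assumed to be optional because the README's example command omits it, and the "full name" check (requiring an `org/model` form) is an assumption drawn from the README's wording.

```python
import argparse

# Category values quoted in the README text; everything else is illustrative.
MODEL_CATEGORIES = ("text-only", "multimodel")


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(
        description="Sketch of a CLI for adding a model to CI (hypothetical).")
    parser.add_argument("--model-name", required=True,
                        help="Full Hugging Face name, e.g. meta-llama/Llama-3.1-8B")
    parser.add_argument("--queue", required=True,
                        help="Buildkite queue to run on, e.g. tpu_v6e_queue")
    # Assumption: optional, since the README's example command omits it.
    parser.add_argument("--category", choices=MODEL_CATEGORIES, default=None,
                        help="Model category")
    return parser


def parse_model_args(argv=None) -> argparse.Namespace:
    """Parse arguments and reject model names that are not fully qualified."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: a "full name" has the form <org>/<model>.
    if "/" not in args.model_name:
        parser.error(
            f"--model-name must be the full name, got {args.model_name!r}")
    return args
```

Restricting `--category` with `choices` makes argparse reject any value other than the two documented strings, so a misspelled category fails at the command line rather than ending up in a generated config.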
 
 ```bash
 python add_model_to_ci.py --model-name <MODEL_NAME> --queue <QUEUE_NAME>
@@ -36,8 +37,9 @@ In the generated yml file, there are three TODOs that will need your input:
 
 ## Adding a vLLM-native model
 vLLM-native models are models using the model definition from the vLLM upstream. These models will not go through benchmark on our pipeline. To add a vLLM-native model to CI, model owners can use the prepared [add_model_to_ci.py](pipeline_generation/add_model_to_ci.py) script. The script will populate a buildkite yaml config file in the `.buildkite/models` directory; config files under this directory will be integrated to our pipeline automatically. The python script takes 3 arguments:
-- **model_name**: this is the **full name** of your model on Hugging Face. Please ensure to use the **full name** (ex: `meta-llama/Llama-3.1-8B` instead of `Llama-3.1-8B`) or else we won't be able to find your model.
-- **queue**: this is the queue you want to run on (ex: `tpu_v6e_queue`)
+- **--model-name**: this is the **full name** of your model on Hugging Face. Please ensure to use the **full name** (ex: `meta-llama/Llama-3.1-8B` instead of `Llama-3.1-8B`) or else we won't be able to find your model.
+- **--queue**: this is the queue you want to run on (ex: `tpu_v6e_queue`)
+- **--category**: this parameter allows you to set the model category, with the following options available: "text-only" or "multimodel".
 
 ```bash
 python add_model_to_ci.py --model-name <MODEL_NAME> --queue <QUEUE_NAME> --type vllm-native
@@ -49,8 +51,9 @@ In the generated yml file, there are two TODOs that will need your input:
 
 # Adding a new feature to CI
 To add a new feature to CI, feature owners can use the prepared [add_feature_to_ci.py](pipeline_generation/add_feature_to_ci.py) script. The script will populate a buildkite yaml config file in the `.buildkite/features` directory; config files under this directory will be integrated to our pipeline automatically. The python script takes 2 arguments:
-- **feature_name**: this is the name of your feature
-- **queue**: this is the queue you want to run on (ex: `tpu_v6e_queue`)
+- **--feature-name**: this is the name of your feature
+- **--queue**: this is the queue you want to run on (ex: `tpu_v6e_queue`)
+- **--category**: this parameter allows you to set the feature category, with the following options available: "feature support matrix" or "kernel support matrix".
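The generated feature configs in this commit share a visible pattern: a two-line comment header (name, then category), step keys derived from the feature name, and a `CI_CATEGORY` entry next to `CI_TARGET` and `CI_STAGE`. The sketch below reproduces that pattern under stated assumptions. The helper names are hypothetical, and the character mapping in `sanitize_name` is inferred only from the file names and keys shown in this diff (for example `Speculative_Decoding-_Ngram`), not taken from `add_feature_to_ci.py` itself.

```python
# Category values quoted in the README text for features.
FEATURE_CATEGORIES = ("feature support matrix", "kernel support matrix")


def sanitize_name(name: str) -> str:
    """Map a display name to a file/key stem.

    Assumption: the mapping is inferred from names visible in this commit
    ("/" and "." and " " become "_", ":" becomes "-").
    """
    return (name.replace("/", "_")
                .replace(".", "_")
                .replace(":", "-")
                .replace(" ", "_"))


def header_lines(name: str, category: str) -> list[str]:
    """Return the two comment lines that open each generated config."""
    if category not in FEATURE_CATEGORIES:
        raise ValueError(f"unknown feature category: {category!r}")
    return [f"# {name}", f"# {category}"]


def step_key(name: str, stage: str, record: bool = False) -> str:
    """Build a step key such as '<stem>_CorrectnessTest' or 'record_<stem>_...'."""
    key = f"{sanitize_name(name)}_{stage}"
    return f"record_{key}" if record else key


def record_env(name: str, stage: str, category: str) -> dict[str, str]:
    """Environment block attached to a record step in the diffs below."""
    return {"CI_TARGET": name, "CI_STAGE": stage, "CI_CATEGORY": category}
```

For instance, `step_key("async scheduler", "CorrectnessTest")` yields the `async_scheduler_CorrectnessTest` key that appears in `.buildkite/features/async_scheduler.yml`.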
 
 ```bash
 python add_feature_to_ci.py --feature-name <FEATURE_NAME> --queue <QUEUE_NAME>

.buildkite/features/Collective_Communication_Matmul.yml

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,5 @@
 # Collective Communication Matmul
+# kernel support matrix
 steps:
   - label: "Correctness tests for Collective Communication Matmul"
     key: "Collective_Communication_Matmul_CorrectnessTest"
@@ -13,6 +14,7 @@ steps:
     env:
       CI_TARGET: "Collective Communication Matmul"
       CI_STAGE: "CorrectnessTest"
+      CI_CATEGORY: "kernel support matrix"
     agents:
       queue: cpu
     commands:

.buildkite/features/JAX-Path_Qxix_Quantization.yml

Lines changed: 0 additions & 42 deletions
This file was deleted.

.buildkite/features/Multimodal_Inputs.yml

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,5 @@
 # Multimodal Inputs
+# feature support matrix
 steps:
   - label: "Correctness tests for Multimodal Inputs"
     key: "Multimodal_Inputs_CorrectnessTest"
@@ -13,6 +14,7 @@ steps:
     env:
       CI_TARGET: Multimodal Inputs
       CI_STAGE: "CorrectnessTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:
@@ -33,6 +35,7 @@ steps:
     env:
       CI_TARGET: Multimodal Inputs
       CI_STAGE: "PerformanceTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:

.buildkite/features/Quantized_Matmul_Attention_and_KV_Cache.yml

Lines changed: 23 additions & 19 deletions
@@ -1,27 +1,30 @@
 # Quantized Matmul Attention and KV Cache
+# kernel support matrix
 steps:
-  # - label: "Correctness tests for Quantized Matmul Attention and KV Cache"
-  # key: "Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
-  # soft_fail: true
-  # agents:
-  # queue: cpu
-  # commands:
-  # - echo "covered by performance test"
-  # - label: "Record correctness test result for Quantized Matmul Attention and KV Cache"
-  # key: "record_Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
-  # depends_on: "Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
-  # env:
-  # CI_TARGET: "Quantized Matmul Attention and KV Cache"
-  # CI_STAGE: "CorrectnessTest"
-  # agents:
-  # queue: cpu
-  # commands:
-  # - |
-  # .buildkite/scripts/record_step_result.sh Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest
+  - label: "Correctness tests for Quantized Matmul Attention and KV Cache"
+    key: "Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
+    soft_fail: true
+    agents:
+      queue: cpu
+    commands:
+      - |
+        buildkite-agent meta-data set "Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest" "to be added"
+  - label: "Record correctness test result for Quantized Matmul Attention and KV Cache"
+    key: "record_Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
+    depends_on: "Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
+    env:
+      CI_TARGET: "Quantized Matmul Attention and KV Cache"
+      CI_STAGE: "CorrectnessTest"
+      CI_CATEGORY: "kernel support matrix"
+    agents:
+      queue: cpu
+    commands:
+      - |
+        .buildkite/scripts/record_step_result.sh Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest
 
   - label: "Performance tests for Quantized Matmul Attention and KV Cache"
     key: "Quantized_Matmul_Attention_and_KV_Cache_PerformanceTest"
-    # depends_on: "record_Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
+    depends_on: "record_Quantized_Matmul_Attention_and_KV_Cache_CorrectnessTest"
     soft_fail: true
     agents:
       queue: tpu_v6e_8_queue
@@ -44,6 +47,7 @@ steps:
     env:
       CI_TARGET: "Quantized Matmul Attention and KV Cache"
       CI_STAGE: "PerformanceTest"
+      CI_CATEGORY: "kernel support matrix"
     agents:
       queue: cpu
     commands:

.buildkite/features/Speculative_Decoding-_Ngram.yml

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,5 @@
 # Speculative Decoding: Ngram
+# feature support matrix
 steps:
   - label: "Correctness tests for Speculative Decoding: Ngram"
     key: "Speculative_Decoding-_Ngram_CorrectnessTest"
@@ -17,6 +18,7 @@ steps:
     env:
       CI_TARGET: "Speculative Decoding: Ngram"
       CI_STAGE: "CorrectnessTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:
@@ -42,6 +44,7 @@ steps:
     env:
       CI_TARGET: "Speculative Decoding: Ngram"
       CI_STAGE: "PerformanceTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:

.buildkite/features/Structured_Decoding.yml

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,5 @@
 # Structured Decoding
+# feature support matrix
 steps:
   - label: "Correctness tests for Structured Decoding"
     key: "Structured_Decoding_CorrectnessTest"
@@ -13,6 +14,7 @@ steps:
     env:
       CI_TARGET: Structured Decoding
       CI_STAGE: "CorrectnessTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:

.buildkite/features/async_scheduler.yml

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,5 @@
 # async scheduler
+# feature support matrix
 steps:
   - label: "Correctness tests for async scheduler"
     key: "async_scheduler_CorrectnessTest"
@@ -13,6 +14,7 @@ steps:
     env:
       CI_TARGET: "async scheduler"
       CI_STAGE: "CorrectnessTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:
@@ -33,6 +35,7 @@ steps:
     env:
       CI_TARGET: "async scheduler"
       CI_STAGE: "PerformanceTest"
+      CI_CATEGORY: "feature support matrix"
     agents:
       queue: cpu
     commands:
Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
-Chunked Prefill
-Prefix Caching
-Ragged Paged Attention V3
-Single Program Multi Data
+Chunked Prefill (feature support matrix)
+Prefix Caching (feature support matrix)
+Ragged Paged Attention V3 (kernel support matrix)
+Single Program Multi Data (feature support matrix)
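Each entry in this list now carries its category as a trailing parenthesized suffix. A consumer of the file could split entries into per-category groups roughly as follows. This is a sketch only: the diff does not show which script reads this file or how, and `parse_entry` and `group_by_category` are hypothetical names.

```python
import re
from collections import defaultdict

# "<name> (<category>)": the category is the final parenthesized group.
ENTRY_RE = re.compile(r"^(?P<name>.+?)\s*\((?P<category>[^()]+)\)\s*$")


def parse_entry(line: str) -> tuple[str, str]:
    """Split one list line into (feature name, category)."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"entry has no '(category)' suffix: {line!r}")
    return match.group("name"), match.group("category")


def group_by_category(lines: list[str]) -> dict[str, list[str]]:
    """Group feature names by category, preserving file order within a group."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        name, category = parse_entry(line)
        grouped[category].append(name)
    return dict(grouped)
```

Raising on a missing suffix, rather than silently assigning a default category, keeps an entry written in the old format from landing in the wrong matrix unnoticed.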

.buildkite/models/Qwen_Qwen2_5-VL-7B-Instruct.yml

Lines changed: 5 additions & 1 deletion
@@ -1,4 +1,5 @@
 # Qwen/Qwen2.5-VL-7B-Instruct
+# multimodel
 steps:
   - label: "Unit tests for Qwen/Qwen2.5-VL-7B-Instruct"
     key: "Qwen_Qwen2_5-VL-7B-Instruct_UnitTest"
@@ -13,8 +14,9 @@ steps:
     key: "record_Qwen_Qwen2_5-VL-7B-Instruct_UnitTest"
     depends_on: "Qwen_Qwen2_5-VL-7B-Instruct_UnitTest"
     env:
-      CI_STAGE: "UnitTest"
       CI_TARGET: Qwen/Qwen2.5-VL-7B-Instruct
+      CI_STAGE: "UnitTest"
+      CI_CATEGORY: "multimodel"
     agents:
       queue: cpu
     commands:
@@ -40,6 +42,7 @@ steps:
     env:
       CI_TARGET: Qwen/Qwen2.5-VL-7B-Instruct
       CI_STAGE: "IntegrationTest"
+      CI_CATEGORY: "multimodel"
     agents:
       queue: cpu
     commands:
@@ -61,6 +64,7 @@ steps:
     env:
       CI_TARGET: Qwen/Qwen2.5-VL-7B-Instruct
       CI_STAGE: "Benchmark"
+      CI_CATEGORY: "multimodel"
     agents:
       queue: cpu
     commands:
