[AMD] Add DeepSeek-R1-0528 FP8 MI355X ATOM MTP3 benchmark #1628
base: main
Changes from all commits
76a6073
621f897
a251a0d
4a32700
c1e8e6f
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -24,23 +24,22 @@ PORT=${PORT:-8888} | |
|
|
||
| export OMP_NUM_THREADS=1 | ||
|
|
||
| # Calculate max-model-len based on ISL and OSL | ||
| if [ "$ISL" = "1024" ] && [ "$OSL" = "1024" ]; then | ||
| CALCULATED_MAX_MODEL_LEN="" | ||
| else | ||
| CALCULATED_MAX_MODEL_LEN=" --max-model-len 10240 " | ||
|
8192 ISL missing max-model-len (Medium Severity): The script no longer sets … Reviewed by Cursor Bugbot for commit 4a32700.
8k1k missing max model len (Medium Severity): The ISL/OSL branch that passed … Reviewed by Cursor Bugbot for commit c1e8e6f. |
||
| fi | ||
|
|
||
| CALCULATED_MAX_MODEL_LEN="" | ||
| if [ "${EVAL_ONLY}" = "true" ]; then | ||
| setup_eval_context | ||
| CALCULATED_MAX_MODEL_LEN=" --max-model-len $EVAL_MAX_MODEL_LEN " | ||
| fi | ||
|
|
||
| if [ "$EP_SIZE" -gt 1 ]; then | ||
| EP=" --enable-expert-parallel" | ||
| else | ||
| EP=" " | ||
| fi | ||
| PARALLEL_ARGS=(-tp "$TP") #TP | ||
| if [ "$DP_ATTENTION" = "true" ]; then | ||
| if [ "$EP_SIZE" -gt 1 ]; then #DP+EP | ||
| PARALLEL_ARGS=(-tp "$TP" --enable-expert-parallel --enable-dp-attention ) | ||
| else #DP+TP | ||
| PARALLEL_ARGS=(-tp "$TP" --enable-dp-attention ) | ||
| fi | ||
| fi | ||
|
Expert parallel ignores EP_SIZE alone (Medium Severity): The script only adds … Reviewed by Cursor Bugbot for commit a251a0d.
EP ignored without DP attention (Medium Severity). Reviewed by Cursor Bugbot for commit 4a32700.
Expert parallel ignores EP_SIZE (Medium Severity). Reviewed by Cursor Bugbot for commit c1e8e6f. |
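One possible way to address the expert-parallel comments, assuming the intent is that `EP_SIZE` greater than 1 should enable expert parallelism whether or not `DP_ATTENTION` is set, is to append each flag independently. This is only a sketch: `build_parallel_args` is a hypothetical helper that does not appear in the PR, and the default values on the last lines are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): build PARALLEL_ARGS so that
# --enable-expert-parallel depends only on EP_SIZE and
# --enable-dp-attention depends only on DP_ATTENTION.
build_parallel_args() {
    local tp="$1" ep_size="$2" dp_attention="$3"

    # Tensor parallelism is always passed.
    PARALLEL_ARGS=(-tp "$tp")

    # Expert parallelism is enabled whenever EP_SIZE > 1.
    if [ "$ep_size" -gt 1 ]; then
        PARALLEL_ARGS+=(--enable-expert-parallel)
    fi

    # DP attention is an independent switch.
    if [ "$dp_attention" = "true" ]; then
        PARALLEL_ARGS+=(--enable-dp-attention)
    fi
}

# Illustrative defaults; the real script receives these from its environment.
build_parallel_args "${TP:-8}" "${EP_SIZE:-1}" "${DP_ATTENTION:-false}"
printf '%s\n' "${PARALLEL_ARGS[*]}"
```

With this shape, the four combinations of `EP_SIZE` and `DP_ATTENTION` each map to a distinct argument list, and the server invocation can keep expanding `"${PARALLEL_ARGS[@]}"` unchanged.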
||
|
|
||
| SPEC_ARGS=(--method mtp --num-speculative-tokens 3 ) | ||
|
|
||
| # Start GPU monitoring (power, temperature, clocks every second) | ||
| start_gpu_monitor | ||
|
|
@@ -50,10 +49,9 @@ set -x | |
| python3 -m atom.entrypoints.openai_server \ | ||
| --model $MODEL \ | ||
| --server-port $PORT \ | ||
| -tp $TP \ | ||
| --kv_cache_dtype fp8 $CALCULATED_MAX_MODEL_LEN $EP \ | ||
| --method mtp \ | ||
|
seungrokj marked this conversation as resolved.
|
||
| --num-speculative-tokens 3 \ | ||
| "${PARALLEL_ARGS[@]}" \ | ||
| "${SPEC_ARGS[@]}" \ | ||
| --kv_cache_dtype fp8 $CALCULATED_MAX_MODEL_LEN \ | ||
| > $SERVER_LOG 2>&1 & | ||
|
|
||
| SERVER_PID=$! | ||
|
|
||
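Regarding the review comments about the removed ISL/OSL branch earlier in this diff: a minimal sketch of one way to keep the eval-only override while still passing `--max-model-len` for non-1k1k benchmark runs. The value 10240 is taken from the removed branch; `build_max_model_len_args` and `MAX_MODEL_LEN_ARGS` are hypothetical names, and the sketch assumes `EVAL_MAX_MODEL_LEN` has already been populated (in the PR that happens via `setup_eval_context`, which is not called here).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): choose the --max-model-len
# arguments. The eval-only path takes priority; otherwise the removed
# ISL/OSL rule is restored (no flag for 1024/1024, 10240 for anything else).
build_max_model_len_args() {
    local isl="$1" osl="$2" eval_only="$3" eval_max_len="$4"

    MAX_MODEL_LEN_ARGS=()
    if [ "$eval_only" = "true" ]; then
        # Assumes the caller has already derived the eval context length.
        MAX_MODEL_LEN_ARGS=(--max-model-len "$eval_max_len")
    elif [ "$isl" != "1024" ] || [ "$osl" != "1024" ]; then
        MAX_MODEL_LEN_ARGS=(--max-model-len 10240)
    fi
}

# Illustrative defaults; the real script receives these from its environment.
build_max_model_len_args "${ISL:-8192}" "${OSL:-1024}" \
    "${EVAL_ONLY:-false}" "${EVAL_MAX_MODEL_LEN:-}"
printf '%s\n' "${MAX_MODEL_LEN_ARGS[*]}"
```

Using an array instead of a space-padded string lets the server command expand `"${MAX_MODEL_LEN_ARGS[@]}"` the same way it expands `"${PARALLEL_ARGS[@]}"` and `"${SPEC_ARGS[@]}"`, and an empty array contributes no arguments.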

