Add M5 Max 40 GPU (128 GB) benchmark data and visualization#97

Open
Chida82 wants to merge 2 commits into antirez:main from Chida82:benchM5max

Conversation


@Chida82 Chida82 commented May 12, 2026

Thanks for the awesome work. Here is the benchmark for my M5 Max (18-core CPU, 40-core GPU).

Hardware: Apple M5 Max with CPU 18-Core and GPU 40-Core, 128 GB unified memory. Metal backend, IQ2XXS w2Q2K imatrix GGUF.

Ran the command from speed-bench/README.md:

./ds4-bench \
  -m ds4flash.gguf \
  --prompt-file speed-bench/promessi_sposi.txt \
  --ctx-start 2048 \
  --ctx-max 65536 \
  --step-incr 2048 \
  --gen-tokens 128

Then:

python3 speed-bench/plot_speed.py speed-bench/m5_max_40_gpu.csv --title "M5 Max 40 Gpu t/s"
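
As a rough sketch of what the sweep flags in the `ds4-bench` command imply, the snippet below lists the context sizes that would be measured. It assumes the benchmark steps from `--ctx-start` to `--ctx-max` inclusive in `--step-incr` increments, which matches the 32 rows in the CSV tables posted in this thread; the `sweep_points` helper is hypothetical and not part of the repository.

```python
# Sweep parameters copied from the ds4-bench command above.
CTX_START = 2048
CTX_MAX = 65536
STEP_INCR = 2048


def sweep_points(start: int, stop: int, step: int) -> list[int]:
    """Context sizes from start to stop inclusive, spaced by step.

    Assumption: this mirrors how the benchmark walks the context range;
    it is inferred from the CSV output, not from the ds4-bench source.
    """
    return list(range(start, stop + 1, step))


points = sweep_points(CTX_START, CTX_MAX, STEP_INCR)
print(len(points), points[0], points[-1])
```

Under that assumption the sweep yields 32 measurement points, from 2048 up to 65536 tokens.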

Other info:

ds4-bench: context buffers 1311.89 MiB (ctx=65665, backend=metal, prefill_chunk=2048, raw_kv_rows=2304, compressed_kv_rows=16418)
ds4: Metal device Apple M5 Max, 128.00 GiB RAM
ds4: requesting Metal residency (may take tens of seconds)... done
ds4: warming Metal model views... done
ds4: Metal model views created in 2.354 ms, residency requested in 277.150 ms, warmup 4.174 ms (mapped 82697.67 MiB from offset 5.08 MiB)
ds4: Metal mapped mmaped model as 2 overlapping shared buffers
ds4: metal backend initialized for graph diagnostics


kaxap commented May 12, 2026

Was your laptop connected to a power outlet? My numbers are a bit higher (+2 TPS):

ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,375.87,128,31.32,52184460
4096,2048,339.83,128,30.95,80373132
6144,2048,327.62,128,30.91,108561804
8192,2048,311.13,128,27.62,136750476
10240,2048,302.70,128,25.06,164939148
12288,2048,311.74,128,24.94,193127820
14336,2048,315.33,128,24.78,221316492
16384,2048,316.74,128,24.77,249505164
18432,2048,307.98,128,24.79,277693836
20480,2048,304.57,128,26.49,305882508
22528,2048,289.75,128,29.63,334071180
24576,2048,279.28,128,29.63,362259852
26624,2048,272.07,128,29.18,390448524
28672,2048,267.91,128,29.12,418637196
30720,2048,264.20,128,28.83,446825868
32768,2048,261.81,128,28.78,475014540
34816,2048,256.72,128,28.24,503203212
36864,2048,254.37,128,28.27,531391884
38912,2048,253.33,128,28.08,559580556
40960,2048,248.62,128,28.23,587769228
43008,2048,247.22,128,27.73,615957900
45056,2048,244.05,128,27.89,644146572
47104,2048,242.79,128,27.84,672335244
49152,2048,240.79,128,27.87,700523916
51200,2048,233.38,128,27.57,728712588
53248,2048,241.35,128,27.46,756901260
55296,2048,235.96,128,27.34,785089932
57344,2048,240.54,128,26.84,813278604
59392,2048,231.79,128,27.11,841467276
61440,2048,228.84,128,27.25,869655948
63488,2048,229.56,128,26.94,897844620
65536,2048,227.26,128,26.81,926033292
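
One thing the `kvcache_bytes` column makes easy to check is how the KV cache grows with context. The sketch below is illustrative only: it takes the first four rows of the table above (the `SAMPLE` variable and the `kv_bytes_per_token` helper are hypothetical names) and divides the byte increase between consecutive rows by the token increase.

```python
import csv
import io

# First four rows of the CSV above, pasted verbatim.
SAMPLE = """ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,375.87,128,31.32,52184460
4096,2048,339.83,128,30.95,80373132
6144,2048,327.62,128,30.91,108561804
8192,2048,311.13,128,27.62,136750476
"""


def kv_bytes_per_token(csv_text: str) -> list[float]:
    """Bytes of KV cache added per extra context token, row to row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    growth = []
    for prev, cur in zip(rows, rows[1:]):
        d_bytes = int(cur["kvcache_bytes"]) - int(prev["kvcache_bytes"])
        d_tokens = int(cur["ctx_tokens"]) - int(prev["ctx_tokens"])
        growth.append(d_bytes / d_tokens)
    return growth


print(kv_bytes_per_token(SAMPLE))
```

For these rows each 2048-token step adds 28188672 bytes, so the reported cache size grows linearly at 13764 bytes per token.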

Author

Chida82 commented May 12, 2026

Yes, but I now have a second run with High Power selected.

So there are two different situations:

  • Automatic
  • High Power

ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,372.15,128,31.48,52184460
4096,2048,337.42,128,31.10,80373132
6144,2048,333.76,128,30.77,108561804
8192,2048,330.69,128,30.79,136750476
10240,2048,322.36,128,30.46,164939148
12288,2048,310.95,128,30.45,193127820
14336,2048,310.05,128,30.15,221316492
16384,2048,312.23,128,30.06,249505164
18432,2048,308.68,128,29.69,277693836
20480,2048,306.42,128,29.84,305882508
22528,2048,302.78,128,29.55,334071180
24576,2048,300.89,128,29.55,362259852
26624,2048,295.56,128,29.20,390448524
28672,2048,292.72,128,29.27,418637196
30720,2048,289.74,128,28.96,446825868
32768,2048,287.67,128,28.93,475014540
34816,2048,282.43,128,28.67,503203212
36864,2048,280.37,128,28.62,531391884
38912,2048,277.36,128,28.47,559580556
40960,2048,275.66,128,28.48,587769228
43008,2048,271.20,128,28.22,615957900
45056,2048,268.62,128,28.21,644146572
47104,2048,265.80,128,27.98,672335244
49152,2048,263.89,128,28.05,700523916
51200,2048,260.43,128,27.81,728712588
53248,2048,258.63,128,27.83,756901260
55296,2048,255.69,128,27.46,785089932
57344,2048,254.13,128,27.55,813278604
59392,2048,250.31,128,27.31,841467276
61440,2048,248.56,128,27.33,869655948
63488,2048,245.97,128,27.08,897844620
65536,2048,244.73,128,26.97,926033292
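
To put the two posted tables side by side, here is a small sketch that compares three context sizes from kaxap's table with the same rows of the table directly above. The dictionary names and the `deltas` helper are hypothetical; the numbers are copied from the two CSV tables in this thread.

```python
# ctx_tokens -> (prefill_tps, gen_tps), copied from the two tables in this thread.
KAXAP = {2048: (375.87, 31.32), 32768: (261.81, 28.78), 65536: (227.26, 26.81)}
CHIDA82 = {2048: (372.15, 31.48), 32768: (287.67, 28.93), 65536: (244.73, 26.97)}


def deltas(a: dict, b: dict) -> dict:
    """Return {ctx: (prefill_delta, gen_delta)} computed as b minus a."""
    return {
        ctx: (round(b[ctx][0] - a[ctx][0], 2), round(b[ctx][1] - a[ctx][1], 2))
        for ctx in sorted(a.keys() & b.keys())
    }


for ctx, (d_prefill, d_gen) in deltas(KAXAP, CHIDA82).items():
    print(f"{ctx:>6}  prefill {d_prefill:+.2f} t/s  gen {d_gen:+.2f} t/s")
```

At these three points the generation speeds differ by less than 0.2 t/s, while prefill in the table above is roughly 17 to 26 t/s higher at the longer contexts. A single run per machine is not enough to say how much of that is run-to-run noise.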

@Chida82 Chida82 changed the title Add M5 Max 40 GPU benchmark data and visualization Add M5 Max 40 GPU (128 GB) benchmark data and visualization May 12, 2026