Add M5 Max 40 GPU (128 GB) benchmark data and visualization by Chida82 · Pull Request #97 · antirez/ds4

Chida82 · 2026-05-12T13:47:56Z

Thanks for the awesome work. Here is the benchmark for my M5 Max (18-core CPU, 40-core GPU).

Hardware: Apple M5 Max, 18-core CPU, 40-core GPU, 128 GB unified memory. Metal backend, IQ2XXS w2Q2K imatrix GGUF.

Ran the command from speed-bench/README.md:

./ds4-bench \
  -m ds4flash.gguf \
  --prompt-file speed-bench/promessi_sposi.txt \
  --ctx-start 2048 \
  --ctx-max 65536 \
  --step-incr 2048 \
  --gen-tokens 128

Then:

python3 speed-bench/plot_speed.py speed-bench/m5_max_40_gpu.csv --title "M5 Max 40 Gpu t/s"

Other info from the run:

ds4-bench: context buffers 1311.89 MiB (ctx=65665, backend=metal, prefill_chunk=2048, raw_kv_rows=2304, compressed_kv_rows=16418)
ds4: Metal device Apple M5 Max, 128.00 GiB RAM
ds4: requesting Metal residency (may take tens of seconds)... done
ds4: warming Metal model views... done
ds4: Metal model views created in 2.354 ms, residency requested in 277.150 ms, warmup 4.174 ms (mapped 82697.67 MiB from offset 5.08 MiB)
ds4: Metal mapped mmaped model as 2 overlapping shared buffers
ds4: metal backend initialized for graph diagnostics

kaxap · 2026-05-12T13:55:42Z

Was your laptop connected to a power outlet? My numbers are a bit higher (+2 TPS):

ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,375.87,128,31.32,52184460
4096,2048,339.83,128,30.95,80373132
6144,2048,327.62,128,30.91,108561804
8192,2048,311.13,128,27.62,136750476
10240,2048,302.70,128,25.06,164939148
12288,2048,311.74,128,24.94,193127820
14336,2048,315.33,128,24.78,221316492
16384,2048,316.74,128,24.77,249505164
18432,2048,307.98,128,24.79,277693836
20480,2048,304.57,128,26.49,305882508
22528,2048,289.75,128,29.63,334071180
24576,2048,279.28,128,29.63,362259852
26624,2048,272.07,128,29.18,390448524
28672,2048,267.91,128,29.12,418637196
30720,2048,264.20,128,28.83,446825868
32768,2048,261.81,128,28.78,475014540
34816,2048,256.72,128,28.24,503203212
36864,2048,254.37,128,28.27,531391884
38912,2048,253.33,128,28.08,559580556
40960,2048,248.62,128,28.23,587769228
43008,2048,247.22,128,27.73,615957900
45056,2048,244.05,128,27.89,644146572
47104,2048,242.79,128,27.84,672335244
49152,2048,240.79,128,27.87,700523916
51200,2048,233.38,128,27.57,728712588
53248,2048,241.35,128,27.46,756901260
55296,2048,235.96,128,27.34,785089932
57344,2048,240.54,128,26.84,813278604
59392,2048,231.79,128,27.11,841467276
61440,2048,228.84,128,27.25,869655948
63488,2048,229.56,128,26.94,897844620
65536,2048,227.26,128,26.81,926033292
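
As a rough cross-check on the kvcache_bytes column above, the values appear to grow linearly with ctx_tokens. A minimal Python sketch, assuming a simple fixed-plus-per-token model (the model and the names `bytes_per_token` and `fixed_bytes` are illustrative, not taken from ds4), fits a line through the first two rows and checks it against a few others:

```python
# (ctx_tokens, kvcache_bytes) pairs copied from the table above.
samples = [
    (2048, 52184460),
    (4096, 80373132),
    (6144, 108561804),
    (65536, 926033292),
]


def linear_fit(p0, p1):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x0, y0), (x1, y1) = p0, p1
    slope = (y1 - y0) / (x1 - x0)
    intercept = y0 - slope * x0
    return slope, intercept


# Assumption: kvcache_bytes = fixed_bytes + bytes_per_token * ctx_tokens.
bytes_per_token, fixed_bytes = linear_fit(samples[0], samples[1])

# Residual of the model for every sample row (0 means an exact match).
residuals = [
    observed - (fixed_bytes + bytes_per_token * ctx)
    for ctx, observed in samples
]

print(f"bytes per context token: {bytes_per_token:.0f}")
print(f"fixed overhead in bytes: {fixed_bytes:.0f}")
print(f"residuals: {residuals}")
```

For these rows the fit gives 13764 bytes per token of context plus a fixed 23995788 bytes, and it reproduces the 6144 and 65536 rows exactly. What the fixed term covers is not stated in this thread.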

Chida82 · 2026-05-12T14:14:31Z

Yes, but I now have a second run with High Power selected.

So there are two different situations:

Automatic
High Power

ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,372.15,128,31.48,52184460
4096,2048,337.42,128,31.10,80373132
6144,2048,333.76,128,30.77,108561804
8192,2048,330.69,128,30.79,136750476
10240,2048,322.36,128,30.46,164939148
12288,2048,310.95,128,30.45,193127820
14336,2048,310.05,128,30.15,221316492
16384,2048,312.23,128,30.06,249505164
18432,2048,308.68,128,29.69,277693836
20480,2048,306.42,128,29.84,305882508
22528,2048,302.78,128,29.55,334071180
24576,2048,300.89,128,29.55,362259852
26624,2048,295.56,128,29.20,390448524
28672,2048,292.72,128,29.27,418637196
30720,2048,289.74,128,28.96,446825868
32768,2048,287.67,128,28.93,475014540
34816,2048,282.43,128,28.67,503203212
36864,2048,280.37,128,28.62,531391884
38912,2048,277.36,128,28.47,559580556
40960,2048,275.66,128,28.48,587769228
43008,2048,271.20,128,28.22,615957900
45056,2048,268.62,128,28.21,644146572
47104,2048,265.80,128,27.98,672335244
49152,2048,263.89,128,28.05,700523916
51200,2048,260.43,128,27.81,728712588
53248,2048,258.63,128,27.83,756901260
55296,2048,255.69,128,27.46,785089932
57344,2048,254.13,128,27.55,813278604
59392,2048,250.31,128,27.31,841467276
61440,2048,248.56,128,27.33,869655948
63488,2048,245.97,128,27.08,897844620
65536,2048,244.73,128,26.97,926033292
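
To see where the two posted tables actually differ, a short Python sketch can line up gen_tps by ctx_tokens and flag rows with a large gap. The row subset, the 1.0 t/s threshold, and the names `first_table` and `second_table` are assumptions made for illustration; the thread does not say explicitly which power mode each table corresponds to.

```python
# gen_tps by ctx_tokens, a subset of rows copied from the two tables in this
# thread: kaxap's table first, then the table in Chida82's follow-up comment.
first_table = {
    2048: 31.32, 8192: 27.62, 12288: 24.94, 16384: 24.77,
    20480: 26.49, 24576: 29.63, 65536: 26.81,
}
second_table = {
    2048: 31.48, 8192: 30.79, 12288: 30.45, 16384: 30.06,
    20480: 29.84, 24576: 29.55, 65536: 26.97,
}

THRESHOLD_TPS = 1.0  # illustrative cutoff, not from the thread

# Difference in generation speed for every context size present in both tables.
deltas = {
    ctx: second_table[ctx] - first_table[ctx]
    for ctx in sorted(first_table.keys() & second_table.keys())
}

# Context sizes where the two runs disagree by more than the threshold.
large_gaps = [ctx for ctx, d in deltas.items() if abs(d) > THRESHOLD_TPS]

for ctx, d in deltas.items():
    flag = "  <-- large gap" if ctx in large_gaps else ""
    print(f"{ctx:>6} tokens: {d:+.2f} t/s{flag}")
```

On this subset, the tables agree to within a fraction of a token per second at 2048, 24576 and 65536, while the rows from 8192 to 20480 differ by several tokens per second because of the dip in the first table.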

Add M5 Max 40 GPU benchmark data and visualization

0314bc5

Add M5 Max 40 GPU benchmark data and visualization files

e524688

Chida82 changed the title ~~Add M5 Max 40 GPU benchmark data and visualization~~ Add M5 Max 40 GPU (128 GB) benchmark data and visualization May 12, 2026

Chida82 wants to merge 2 commits into antirez:main from Chida82:benchM5max