Add MacBook Pro M5 Max (128 GB) Metal throughput report #103

Open
lixiangnlp wants to merge 1 commit into antirez:main from lixiangnlp:bench/m5-max-128gb-report

@lixiangnlp

Summary

Adds a five-run ds4-bench performance report for a MacBook Pro with Apple M5 Max, 128 GB
(Metal, IQ2 weight-mostly), using the exact command from the README:

./ds4-bench -m ds4flash.gguf \
  --prompt-file bench/promessi_sposi.txt \
  --ctx-start 2048 --ctx-max 65536 \
  --step-incr 2048 --gen-tokens 128
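
As a rough sketch of how five sweeps with a cool-down in between could be scripted, the Python below wraps the command above in a loop. The cool-down length (`cooldown_s`) is a hypothetical value, since the PR does not state the duration, and capturing stdout and stderr into `run{i}.log` is an assumption about how the logs were produced, not something confirmed here.

```python
import subprocess
import time
from pathlib import Path

# Command copied verbatim from the PR description.
BENCH_CMD = [
    "./ds4-bench", "-m", "ds4flash.gguf",
    "--prompt-file", "bench/promessi_sposi.txt",
    "--ctx-start", "2048", "--ctx-max", "65536",
    "--step-incr", "2048", "--gen-tokens", "128",
]


def run_sweeps(n_runs=5, cooldown_s=300, out_dir="bench/results",
               runner=subprocess.run, sleep=time.sleep):
    """Run the benchmark n_runs times, idling cooldown_s seconds between runs.

    cooldown_s=300 is a hypothetical default. Returns the log paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    logs = []
    for i in range(1, n_runs + 1):
        log_path = out / f"run{i}.log"
        with open(log_path, "w") as log:
            # Assumption: stdout and stderr together make up the run log.
            runner(BENCH_CMD, stdout=log, stderr=subprocess.STDOUT, check=True)
        logs.append(log_path)
        if i < n_runs:
            sleep(cooldown_s)  # passive cool-down: no load, just wait
    return logs
```

The `runner` and `sleep` parameters exist only so the loop can be exercised without the real binary; calling `run_sweeps()` with the defaults would invoke `./ds4-bench` five times.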

The benchmark was run as five back-to-back sweeps, with a passive cool-down between
runs so that one hot sweep does not dominate the headline numbers. The PR ships:

  • bench/m5_max_128gb_report.md — host/model setup, per-run summary, and a
    per-frontier table with mean ± σ across the five runs.
  • bench/results/run{1..5}.csv + bench/results/run{1..5}.log — raw outputs
    so any aggregate in the report can be regenerated.
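
A minimal sketch of how the per-frontier mean and σ could be regenerated from the raw CSVs listed above. The column names `ctx`, `prefill_tps`, and `gen_tps` are hypothetical, since the actual CSV header is not shown in this PR, and the sketch uses the sample standard deviation, which may differ from the convention used in the report.

```python
import csv
import statistics
from collections import defaultdict


def load_run(path):
    """Read one run CSV into a list of dict rows (column names are assumed)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def aggregate(runs, key="ctx", metrics=("prefill_tps", "gen_tps")):
    """Return {ctx: {metric: (mean, sample_stdev)}} across all runs."""
    samples = defaultdict(lambda: defaultdict(list))
    for rows in runs:
        for row in rows:
            for m in metrics:
                samples[int(row[key])][m].append(float(row[m]))
    return {
        ctx: {m: (statistics.mean(v), statistics.stdev(v))
              for m, v in per_metric.items()}
        for ctx, per_metric in sorted(samples.items())
    }
```

Under those assumptions, `aggregate([load_run(f"bench/results/run{i}.csv") for i in range(1, 6)])` would yield one entry per context frontier.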

Headline numbers (n = 5, Metal)

| ctx   | prefill mean, t/s (σ) | gen mean, t/s (σ) |
|-------|-----------------------|-------------------|
| 2048  | 369.7 (1.7)           | 31.46 (0.07)      |
| 16384 | 261.9 (6.8)           | 28.50 (0.80)      |
| 32768 | 222.4 (13.5)          | 24.29 (1.27)      |
| 65536 | 161.5 (5.9)           | 20.30 (1.13)      |

Decode falls ~35% from 2k to 64k (31.5 → 20.3 t/s); prefill plateaus at ~160 t/s
past ~50k. Mid-context rows show the highest run-to-run variance: that window
of every sweep lands in the thermally loaded phase, and run 4 specifically was
the hottest pass.
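
The decline figure can be checked directly from the headline means. The snippet below is only a worked calculation using the four rows of the headline table; the helper name `pct_drop` is illustrative.

```python
# Headline means copied from this PR: ctx -> (prefill t/s, gen t/s).
HEADLINE = {
    2048: (369.7, 31.46),
    16384: (261.9, 28.50),
    32768: (222.4, 24.29),
    65536: (161.5, 20.30),
}


def pct_drop(first, last):
    """Percentage decline going from first to last."""
    return 100.0 * (first - last) / first


gen_drop = pct_drop(HEADLINE[2048][1], HEADLINE[65536][1])
prefill_drop = pct_drop(HEADLINE[2048][0], HEADLINE[65536][0])
print(f"decode drop 2k -> 64k:  {gen_drop:.1f}%")
print(f"prefill drop 2k -> 64k: {prefill_drop:.1f}%")
```

The decode result rounds to the ~35% quoted above; the prefill drop over the same range is larger, at a little over half.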

Test plan

  • ./ds4-bench invoked with the README long-context command, 5 times
  • Raw CSV/log captured for each run and committed under bench/results/
  • Aggregates in bench/m5_max_128gb_report.md regenerated from the
    committed CSVs

Five back-to-back ds4-bench sweeps (ctx 2048..65536, step 2048,
gen 128) on Apple M5 Max with passive cool-down between runs.
Includes per-run summary, per-frontier aggregates with σ across
runs, and the raw CSV/log artifacts so the report is reproducible.