A beautiful, interactive CLI for benchmarking Ollama & LM Studio LLM inference speed
- Interactive Model Selection: Beautiful TUI multi-select with filtering by size and name
- Rich Metrics: Generation TPS, prompt eval TPS, TTFT, ITL, and model load time
- Styled Terminal Output: Powered by Lipgloss and Huh
- Multiple Epochs: Run N benchmarks per model, keep the best result
- Configurable Prompts: Code generation, chat, long-form, or custom file-based prompts
- Export Formats: CSV, JSON, or Go benchstat-compatible output
- Batch Mode: Benchmark all matching models without interactive selection
- Automatic Warmup: Every model gets a lightweight warmup run before measurement
- Stability Analysis: See variance, min/max across epochs
- Web Dashboard: Serve results over HTTP with `lmspeedtest serve`
- Auth Support: Bearer token authentication for remote Ollama/LM Studio instances
- Remote-Friendly: Works with local or remote Ollama/LM Studio servers
- Multi-Server Support: Manage and benchmark multiple Ollama/LM Studio servers
- Update Checker: Check for new versions with a single command
- Shell Completions: Bash, Zsh, and Fish completion scripts out of the box
- JSON Output: Machine-readable output for models, dashboard, info, and compare commands
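To make the "Multiple Epochs" and "Stability Analysis" items concrete, here is a minimal sketch of how per-epoch readings could be reduced to a best result plus spread. It is an illustration only, not lmspeedtest's actual code: `Summary` and `summarize` are hypothetical names, "best" is assumed to mean the highest generation tokens/sec, and the standard deviation is the population form.

```go
package main

import (
	"fmt"
	"math"
)

// Summary is a hypothetical aggregate over the epochs of one model.
type Summary struct {
	Best, Min, Max, Mean, StdDev float64
}

// summarize takes one tokens/sec reading per epoch and reduces them
// to best/min/max/mean/stddev.
func summarize(tps []float64) (Summary, error) {
	if len(tps) == 0 {
		return Summary{}, fmt.Errorf("no epochs recorded")
	}
	s := Summary{Min: tps[0], Max: tps[0]}
	sum := 0.0
	for _, v := range tps {
		s.Min = math.Min(s.Min, v)
		s.Max = math.Max(s.Max, v)
		sum += v
	}
	s.Mean = sum / float64(len(tps))

	// Population standard deviation around the mean.
	sq := 0.0
	for _, v := range tps {
		d := v - s.Mean
		sq += d * d
	}
	s.StdDev = math.Sqrt(sq / float64(len(tps)))

	// Assumption: the "best" epoch is the fastest one.
	s.Best = s.Max
	return s, nil
}

func main() {
	s, err := summarize([]float64{40, 44, 42})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("best=%.1f min=%.1f max=%.1f stddev=%.2f\n", s.Best, s.Min, s.Max, s.StdDev)
}
```

A small spread relative to the mean suggests the best result is representative; a large one suggests rerunning with more epochs.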
Download the latest release for your platform from the releases page.
macOS (Intel)
curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_darwin_amd64.tar.gz | tar xz
mv lmspeedtest /usr/local/bin/

macOS (Apple Silicon)
curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_darwin_arm64.tar.gz | tar xz
mv lmspeedtest /usr/local/bin/

Linux (x86_64)
curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_linux_amd64.tar.gz | tar xz
sudo mv lmspeedtest /usr/local/bin/

Linux (ARM64)
curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_linux_arm64.tar.gz | tar xz
sudo mv lmspeedtest /usr/local/bin/

Windows (x86_64)
Download `lmspeedtest_*_windows_amd64.zip` from the releases page, extract, and add to your PATH.

Windows (ARM64 - Snapdragon/Qualcomm)
Download `lmspeedtest_*_windows_arm64.zip` from the releases page, extract, and add to your PATH.
go install github.com/notfixingit3/lmspeedtest@latest

git clone https://github.com/notfixingit3/lmspeedtest.git
cd lmspeedtest
go build -o lmspeedtest
chmod +x lmspeedtest

# Configure your Ollama/LM Studio host
./lmspeedtest connect
# List available models
./lmspeedtest models
# Benchmark models ≤ 8 GB with interactive selection
./lmspeedtest test 8
# View your results dashboard
./lmspeedtest dashboard

| Command | Description |
|---|---|
| `connect` | Configure Ollama/LM Studio host and optional auth token |
| `connect --add <name>` | Add a new server profile |
| `connect --list` | List all configured server profiles |
| `connect --default <name>` | Set the default server profile |
| `connect --use <name>` | Switch to a different server profile |
| `connect --remove <name>` | Remove a server profile |
| `info` | Show server version, host, and auth status |
| `doctor` | Run diagnostics: check config, connectivity, and permissions |
| `models [max_gb] [name_filter]` | List models with metadata (params, quantization) |
| `test <max_gb> [opts]` | Benchmark matching models |
| `dashboard` | Show latest results per model |
| `compare <model>` | Compare all context sizes for a model |
| `completions [shell]` | Generate shell completion scripts (bash, zsh, fish) |
| `export [--format fmt]` | Export results (csv, json, benchstat, markdown) |
| `reset` | Clear all benchmark results |
| `serve [--port N]` | Start web dashboard (default: 8080) |
| `update` | Check for available updates |
| `--version` | Show version information |
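The `max_gb` and `name_filter` arguments of `models` and `test` can be thought of as a simple predicate over the server's model list. The sketch below is one plausible reading, not the tool's real implementation: the `Model` type and `matches` function are hypothetical, 1 GB is assumed to be 10^9 bytes, and name matching is assumed to be a case-insensitive substring test where a comma-separated filter matches if any part matches.

```go
package main

import (
	"fmt"
	"strings"
)

// Model is a hypothetical, trimmed-down view of a model entry.
type Model struct {
	Name      string
	SizeBytes int64
}

// Assumption: sizes are compared in decimal gigabytes.
const bytesPerGB = 1e9

// matches reports whether m is at most maxGB in size and, when a
// filter is given, whether any comma-separated part occurs in its name.
func matches(m Model, maxGB float64, nameFilter string) bool {
	if float64(m.SizeBytes) > maxGB*bytesPerGB {
		return false
	}
	if nameFilter == "" {
		return true
	}
	name := strings.ToLower(m.Name)
	for _, part := range strings.Split(nameFilter, ",") {
		part = strings.ToLower(strings.TrimSpace(part))
		if part != "" && strings.Contains(name, part) {
			return true
		}
	}
	return false
}

func main() {
	models := []Model{
		{Name: "llama3.2:latest", SizeBytes: 2_000_000_000},
		{Name: "gemma:7b", SizeBytes: 5_000_000_000},
		{Name: "qwen:72b", SizeBytes: 41_000_000_000},
	}
	for _, m := range models {
		if matches(m, 8, "llama,gemma") {
			fmt.Println(m.Name)
		}
	}
}
```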
./lmspeedtest test 8 # Interactive TUI selection
./lmspeedtest test 8 64k # Use 64k context window
./lmspeedtest test 8 32k qwen # Filter by name + context
./lmspeedtest test 8 llama,gemma4 # Multi-filter: match any name
./lmspeedtest test 8 --all # Benchmark all (skip TUI)
./lmspeedtest test 8 --epochs 3 # Run 3 epochs, keep best
./lmspeedtest test 8 --template code # Code generation prompt
./lmspeedtest test 8 --template chat # Short chat prompt
./lmspeedtest test 8 --template long # Long-form writing (default)
./lmspeedtest test 8 --prompt-file path.txt # Custom prompt from file

The `doctor` command returns specific exit codes for programmatic use:

| Code | Meaning |
|---|---|
| 0 | Pass: all checks OK |
| 1 | Warnings: potential issues detected |
| 2 | Config errors: invalid or missing configuration |
| 3 | Connectivity: cannot reach Ollama/LM Studio |
| 4 | Permissions: file or directory permission issues |
| 5 | Data errors: corrupted or unreadable results data |
# Compare all your models at once
./lmspeedtest test 100 --all
# Deep benchmark: 5 epochs with long-form prompt
./lmspeedtest test 16 --epochs 5 --template long
# Quick code generation benchmark
./lmspeedtest test 8 --template code --epochs 3
# Deep benchmark with custom prompt
./lmspeedtest test 16 --epochs 5 --prompt-file prompt.txt
# Multi-filter: benchmark llama and gemma models only
./lmspeedtest test 16 llama,gemma --epochs 3
# Export for statistical analysis
./lmspeedtest export --format benchstat > results.bench
benchstat results.bench
# Start web dashboard
./lmspeedtest serve
./lmspeedtest serve --port 3000
# Clear all results
./lmspeedtest reset
# Connect to remote Ollama/LM Studio with auth
./lmspeedtest connect
# Host: https://ollama.example.com or http://10.1.6.30:1234
# Token: sk-abc123
./lmspeedtest info
# Check for updates
./lmspeedtest update
# Generate shell completions
./lmspeedtest completions bash > /usr/local/etc/bash_completion.d/lmspeedtest
./lmspeedtest completions zsh > /usr/local/share/zsh/site-functions/_lmspeedtest
./lmspeedtest completions fish > ~/.config/fish/completions/lmspeedtest.fish
# Get JSON output for scripting
./lmspeedtest models --json
./lmspeedtest dashboard --json
./lmspeedtest info --json
./lmspeedtest compare llama3.2:latest --json
# Check version
./lmspeedtest --version
# Multi-server: add and switch between servers
./lmspeedtest connect --add desktop --host http://192.168.1.10:11434
./lmspeedtest connect --add laptop --host http://192.168.1.11:11434 --token sk-abc
./lmspeedtest connect --list
#   default (http://localhost:11434) [active]
#   desktop (http://192.168.1.10:11434)
#   laptop (http://192.168.1.11:11434)
./lmspeedtest connect --use desktop
./lmspeedtest test 8 --all
./lmspeedtest connect --use laptop
./lmspeedtest test 8 --all
./lmspeedtest compare llama3.2:latest
# Shows results from all servers with server column

| Metric | Description |
|---|---|
| Tokens/sec | Generation speed: output tokens per second |
| Prompt TPS | Input processing speed: prompt eval tokens per second |
| TTFT | Time to first token: load + prompt eval duration |
| ITL | Inter-token latency: time between consecutive tokens |
| Load Time | Time to load model weights into GPU/CPU memory |
| Stability | Variance across epochs (stddev, min, max) |
State is stored in ~/.lmspeedtest/:
- `config.json`: Server profiles (host URL, auth token, active profile)
- `results.json`: The 3 most recent benchmark results per model per server
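The three-result cap in `results.json` amounts to keeping a short rolling history per model and server. A minimal sketch of that idea follows; the `Result` and `Store` types are hypothetical and do not reflect the file's real schema.

```go
package main

import "fmt"

// maxResults is the per-model, per-server cap described above.
const maxResults = 3

// Result is a hypothetical benchmark record.
type Result struct {
	Model        string
	TokensPerSec float64
}

// Store maps server name -> model name -> results, oldest first.
type Store map[string]map[string][]Result

// Add appends r for the given server and drops the oldest entries
// so that at most maxResults remain for that model.
func (s Store) Add(server string, r Result) {
	if s[server] == nil {
		s[server] = make(map[string][]Result)
	}
	runs := append(s[server][r.Model], r)
	if len(runs) > maxResults {
		runs = runs[len(runs)-maxResults:]
	}
	s[server][r.Model] = runs
}

func main() {
	store := Store{}
	for _, tps := range []float64{40, 41, 42, 43} {
		store.Add("desktop", Result{Model: "llama3.2:latest", TokensPerSec: tps})
	}
	runs := store["desktop"]["llama3.2:latest"]
	fmt.Println(len(runs), "results kept, oldest:", runs[0].TokensPerSec)
}
```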
Manage multiple Ollama/LM Studio servers with named profiles:
# Add servers
./lmspeedtest connect --add desktop --host http://192.168.1.10:11434
./lmspeedtest connect --add laptop --host http://192.168.1.11:11434 --token sk-abc
# List profiles
./lmspeedtest connect --list
# Switch active server
./lmspeedtest connect --use desktop
# Remove a profile
./lmspeedtest connect --remove laptop

Results are stored per-server, so you can benchmark the same model on different hardware and compare.
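When a profile has a token, requests to the server carry it as a Bearer token. The sketch below shows what such a request looks like with Go's standard `net/http` package. It is an assumption-laden illustration: `newRequest` is a hypothetical helper, `/api/tags` is Ollama's model-listing endpoint (LM Studio uses different paths), and a local `httptest` server stands in for a real remote host so the example runs anywhere.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// newRequest builds a GET request for host+path and attaches the
// token as an Authorization header when one is configured.
func newRequest(host, path, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, strings.TrimRight(host, "/")+path, nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	// Stand-in for a token-protected server (e.g. behind a reverse proxy).
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer sk-abc" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"models":[]}`)
	}))
	defer srv.Close()

	req, err := newRequest(srv.URL, "/api/tags", "sk-abc")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read body:", err)
		return
	}
	fmt.Println(resp.Status, string(body))
}
```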
- Go 1.25.8+
- Ollama or LM Studio instance (local or remote)
Contributions are welcome! The codebase is organized into logical files:
- `main.go`: Entry point and command routing
- `types.go`: Structs and data models
- `config.go`: Configuration persistence and migration
- `styles.go`: Terminal styling helpers
- `commands_connect.go`: Server profile management
- `commands_models.go`: Model listing and benchmarking
- `benchmark.go`: Core benchmark logic
- `commands_other.go`: Dashboard, compare, export, serve, reset

When adding features:

- New subcommand: add a case in the `main()` switch
- New API call: add a helper in the relevant command file
- New TUI form: use the `huh.NewForm(...)` pattern
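As a rough picture of the "add a case in the switch" convention, here is a self-contained sketch of switch-based command routing. It is not copied from `main.go`: the `route`, `runModels`, and `runHello` functions and the `hello` subcommand are hypothetical, and the real handlers will have different signatures.

```go
package main

import (
	"fmt"
	"os"
)

// runModels stands in for an existing command handler.
func runModels(args []string) int {
	fmt.Println("listing models with args:", args)
	return 0
}

// runHello is the handler for a new, hypothetical subcommand.
func runHello(args []string) int {
	fmt.Println("hello from a new subcommand:", args)
	return 0
}

// route dispatches on the first argument and returns an exit code.
func route(args []string) int {
	if len(args) == 0 {
		fmt.Fprintln(os.Stderr, "usage: lmspeedtest <command> [args]")
		return 2
	}
	switch args[0] {
	case "models":
		return runModels(args[1:])
	case "hello": // a new subcommand is one more case plus its handler
		return runHello(args[1:])
	default:
		fmt.Fprintf(os.Stderr, "unknown command %q\n", args[0])
		return 2
	}
}

func main() {
	args := os.Args[1:]
	if len(args) == 0 {
		// Demo input so the sketch does something when run without arguments.
		args = []string{"hello", "world"}
	}
	os.Exit(route(args))
}
```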
Please ensure your code passes golangci-lint and gosec.
If you find this tool useful, consider buying me a coffee.
MIT License. See LICENSE for details.
Made with a fucked-up back and extremely questionable sleep habits by @notfixingit3




