notfixingit3/lmspeedtest

A beautiful, interactive CLI for benchmarking Ollama & LM Studio LLM inference speed


📸 Screenshots

[Screenshots: help output, models list, dashboard, doctor diagnostics]

✨ Features

  • 🎯 Interactive Model Selection: Beautiful TUI multi-select with filtering by size and name
  • 📊 Rich Metrics: Generation TPS, prompt eval TPS, TTFT, ITL, and model load time
  • 🎨 Styled Terminal Output: Powered by Lipgloss and Huh
  • 🔁 Multiple Epochs: Run N benchmarks per model, keep the best result
  • 📝 Configurable Prompts: Code generation, chat, long-form, or custom file-based prompts
  • 📈 Export Formats: CSV, JSON, or Go benchstat-compatible output
  • 🔥 Batch Mode: Benchmark all matching models without interactive selection
  • 🌡️ Automatic Warmup: Every model gets a lightweight warmup run before measurement
  • 📉 Stability Analysis: See variance, min/max across epochs
  • 🌐 Web Dashboard: Serve results over HTTP with lmspeedtest serve
  • 🔐 Auth Support: Bearer token authentication for remote Ollama/LM Studio instances
  • 🖥️ Remote-Friendly: Works with local or remote Ollama/LM Studio servers
  • 🏢 Multi-Server Support: Manage and benchmark multiple Ollama/LM Studio servers
  • 🔍 Update Checker: Check for new versions with a single command
  • 🐚 Shell Completions: Bash, Zsh, and Fish completion scripts out of the box
  • 📋 JSON Output: Machine-readable output for models, dashboard, info, and compare commands
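The "Multiple Epochs" and "Stability Analysis" features above can be illustrated with a small sketch. It is not the tool's own code: the function name epoch_stats is hypothetical, "best" is assumed to mean the highest tokens/sec, and the standard deviation is computed as a population stddev, which may differ from what lmspeedtest reports.

```shell
# Hypothetical helper: summarize tokens/sec values from several epochs.
# Assumptions: "best" = highest value; stddev = population standard deviation.
epoch_stats() {
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | awk '
    NR == 1 { min = $1; max = $1 }
    {
      sum += $1
      sumsq += $1 * $1
      if ($1 < min) min = $1
      if ($1 > max) max = $1
    }
    END {
      mean = sum / NR
      var = sumsq / NR - mean * mean
      if (var < 0) var = 0   # guard against tiny negative rounding error
      printf "best=%.2f min=%.2f max=%.2f mean=%.2f stddev=%.2f\n", max, min, max, mean, sqrt(var)
    }'
}
```

For example, epoch_stats 48 50 52 summarizes three epochs measured at 48, 50, and 52 tokens/sec. A small stddev relative to the mean suggests the runs were stable.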

🚀 Installation

Pre-built Binaries (Recommended)

Download the latest release for your platform from the releases page.

macOS (Intel)

curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_darwin_amd64.tar.gz | tar xz
mv lmspeedtest /usr/local/bin/

macOS (Apple Silicon)

curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_darwin_arm64.tar.gz | tar xz
mv lmspeedtest /usr/local/bin/

Linux (x86_64)

curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_linux_amd64.tar.gz | tar xz
sudo mv lmspeedtest /usr/local/bin/

Linux (ARM64)

curl -L https://github.com/notfixingit3/lmspeedtest/releases/latest/download/lmspeedtest_$(curl -s https://api.github.com/repos/notfixingit3/lmspeedtest/releases/latest | grep tag_name | cut -d '"' -f 4)_linux_arm64.tar.gz | tar xz
sudo mv lmspeedtest /usr/local/bin/

Windows (x86_64)

Download lmspeedtest_*_windows_amd64.zip from the releases page, extract it, and add the folder to your PATH.

Windows (ARM64 - Snapdragon/Qualcomm)

Download lmspeedtest_*_windows_arm64.zip from the releases page, extract it, and add the folder to your PATH.

Via Go Install

go install github.com/notfixingit3/lmspeedtest@latest

From Source

git clone https://github.com/notfixingit3/lmspeedtest.git
cd lmspeedtest
go build -o lmspeedtest
chmod +x lmspeedtest

🎬 Quick Start

# Configure your Ollama/LM Studio host
./lmspeedtest connect

# List available models
./lmspeedtest models

# Benchmark models ≤ 8 GB with interactive selection
./lmspeedtest test 8

# View your results dashboard
./lmspeedtest dashboard

📖 Usage

Commands

| Command | Description |
|---|---|
| connect | Configure Ollama/LM Studio host and optional auth token |
| connect --add <name> | Add a new server profile |
| connect --list | List all configured server profiles |
| connect --default <name> | Set the default server profile |
| connect --use <name> | Switch to a different server profile |
| connect --remove <name> | Remove a server profile |
| info | Show server version, host, and auth status |
| doctor | Run diagnostics: check config, connectivity, and permissions |
| models [max_gb] [name_filter] | List models with metadata (params, quantization) |
| test <max_gb> [opts] | Benchmark matching models |
| dashboard | Show latest results per model |
| compare <model> | Compare all context sizes for a model |
| completions [shell] | Generate shell completion scripts (bash, zsh, fish) |
| export [--format fmt] | Export results (csv, json, benchstat, markdown) |
| reset | Clear all benchmark results |
| serve [--port N] | Start web dashboard (default: 8080) |
| update | Check for available updates |
| --version | Show version information |

Test Options

./lmspeedtest test 8                    # Interactive TUI selection
./lmspeedtest test 8 64k                # Use 64k context window
./lmspeedtest test 8 32k qwen           # Filter by name + context
./lmspeedtest test 8 llama,gemma4       # Multi-filter: match any name
./lmspeedtest test 8 --all              # Benchmark all (skip TUI)
./lmspeedtest test 8 --epochs 3         # Run 3 epochs, keep best
./lmspeedtest test 8 --template code    # Code generation prompt
./lmspeedtest test 8 --template chat    # Short chat prompt
./lmspeedtest test 8 --template long    # Long-form writing (default)
./lmspeedtest test 8 --prompt-file path.txt  # Custom prompt from file
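Since the context size is a positional argument, a sweep over several context windows can be scripted. The following is a sketch only: the function name sweep_contexts and the LMSPEEDTEST override variable are hypothetical, and it assumes that a context size, a name filter, --all, and --epochs can be combined in one invocation, which the examples above suggest but do not show directly.

```shell
# Binary to invoke; override for a dry run, e.g. LMSPEEDTEST=echo (hypothetical variable).
LMSPEEDTEST="${LMSPEEDTEST:-./lmspeedtest}"

# Hypothetical helper: run one non-interactive benchmark per context size.
# Usage: sweep_contexts <max_gb> <name_filter> <context>...
sweep_contexts() {
  local max_gb="$1" filter="$2"
  shift 2
  local ctx
  for ctx in "$@"; do
    # Assumption: these arguments may be combined in a single call.
    "$LMSPEEDTEST" test "$max_gb" "$ctx" "$filter" --all --epochs 3 || return $?
  done
}
```

Running sweep_contexts 8 qwen 32k 64k would benchmark matching models at both context sizes; ./lmspeedtest compare <model> can then show the results side by side.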

Doctor Exit Codes

The doctor command returns specific exit codes for programmatic use:

| Code | Meaning |
|---|---|
| 0 | Pass: all checks OK |
| 1 | Warnings: potential issues detected |
| 2 | Config errors: invalid or missing configuration |
| 3 | Connectivity: cannot reach Ollama/LM Studio |
| 4 | Permissions: file or directory permission issues |
| 5 | Data errors: corrupted or unreadable results data |
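As a sketch of how these exit codes might be used in a script, the helpers below map a code to a short label and gate a benchmark run on the result. The function names doctor_status and preflight are hypothetical, and treating warnings (code 1) as non-fatal is a choice made for this example, not something the tool prescribes.

```shell
# Hypothetical helper: map a doctor exit code to a short label (per the table above).
doctor_status() {
  case "$1" in
    0) echo "pass" ;;
    1) echo "warnings" ;;
    2) echo "config" ;;
    3) echo "connectivity" ;;
    4) echo "permissions" ;;
    5) echo "data" ;;
    *) echo "unknown" ;;
  esac
}

# Hypothetical wrapper: run the given command, report its status, and
# succeed only on pass (0) or warnings (1).
preflight() {
  local code=0
  "$@" || code=$?
  echo "doctor: $(doctor_status "$code") (exit $code)"
  [ "$code" -le 1 ]
}
```

A typical use would be preflight ./lmspeedtest doctor && ./lmspeedtest test 8 --all, so that a long batch run only starts when the diagnostics look healthy.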

Real-World Examples

# Compare all your models at once
./lmspeedtest test 100 --all

# Deep benchmark: 5 epochs with long-form prompt
./lmspeedtest test 16 --epochs 5 --template long

# Quick code generation benchmark
./lmspeedtest test 8 --template code --epochs 3

# Deep benchmark with custom prompt
./lmspeedtest test 16 --epochs 5 --prompt-file prompt.txt

# Multi-filter: benchmark llama and gemma models only
./lmspeedtest test 16 llama,gemma --epochs 3

# Export for statistical analysis
./lmspeedtest export --format benchstat > results.bench
benchstat results.bench

# Start web dashboard
./lmspeedtest serve
./lmspeedtest serve --port 3000

# Clear all results
./lmspeedtest reset

# Connect to remote Ollama/LM Studio with auth
./lmspeedtest connect
# Host: https://ollama.example.com or http://10.1.6.30:1234
# Token: sk-abc123
./lmspeedtest info

# Check for updates
./lmspeedtest update

# Generate shell completions
./lmspeedtest completions bash > /usr/local/etc/bash_completion.d/lmspeedtest
./lmspeedtest completions zsh > /usr/local/share/zsh/site-functions/_lmspeedtest
./lmspeedtest completions fish > ~/.config/fish/completions/lmspeedtest.fish

# Get JSON output for scripting
./lmspeedtest models --json
./lmspeedtest dashboard --json
./lmspeedtest info --json
./lmspeedtest compare llama3.2:latest --json

# Check version
./lmspeedtest --version

# Multi-server: add and switch between servers
./lmspeedtest connect --add desktop --host http://192.168.1.10:11434
./lmspeedtest connect --add laptop --host http://192.168.1.11:11434 --token sk-abc
./lmspeedtest connect --list
# → default (http://localhost:11434) [active]
# → desktop (http://192.168.1.10:11434)
# → laptop (http://192.168.1.11:11434)
./lmspeedtest connect --use desktop
./lmspeedtest test 8 --all
./lmspeedtest connect --use laptop
./lmspeedtest test 8 --all
./lmspeedtest compare llama3.2:latest
# Shows results from all servers with server column
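Before pointing lmspeedtest at a remote host with a token, it can help to confirm that the host and token work at all. The sketch below uses curl directly and is independent of lmspeedtest. The function names are hypothetical, and it assumes the server (or a reverse proxy in front of it) accepts a standard Authorization: Bearer header. The paths are also assumptions to pass in yourself: Ollama lists models at /api/tags, and LM Studio's OpenAI-compatible server lists them at /v1/models.

```shell
# Hypothetical helper: build the Authorization header for a bearer token.
# Prints nothing when the token is empty.
auth_header() {
  if [ -n "$1" ]; then
    printf 'Authorization: Bearer %s\n' "$1"
  fi
}

# Hypothetical helper: list models on a host, with an optional token.
# Usage: probe_models <host> <path> [token]
probe_models() {
  local host="$1" path="$2" token="${3:-}"
  if [ -n "$token" ]; then
    curl -fsS -H "$(auth_header "$token")" "${host%/}${path}"
  else
    curl -fsS "${host%/}${path}"
  fi
}
```

For instance, probe_models https://ollama.example.com /api/tags sk-abc123 should print a JSON model list if the host and token are valid, and fail with a non-zero exit code otherwise.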

📊 Metrics Explained

| Metric | Description |
|---|---|
| Tokens/sec | Generation speed: output tokens per second |
| Prompt TPS | Input processing speed: prompt eval tokens per second |
| TTFT | Time to first token: load + prompt eval duration |
| ITL | Inter-token latency: time between consecutive tokens |
| Load Time | Time to load model weights into GPU/CPU memory |
| Stability | Variance across epochs (stddev, min, max) |
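To make the definitions above concrete, here is a sketch of the arithmetic using the kind of counters Ollama returns from a generate call (load_duration, prompt_eval_count, prompt_eval_duration, eval_count, eval_duration, with durations in nanoseconds). Whether lmspeedtest derives its numbers exactly this way is an assumption; in particular, ITL is approximated here as the average generation time per output token. The function name metrics is hypothetical.

```shell
# Hypothetical helper: derive the table's metrics from raw counters.
# Usage: metrics <load_ns> <prompt_tokens> <prompt_ns> <gen_tokens> <gen_ns>
# All token counts and durations must be non-zero.
metrics() {
  awk -v load="$1" -v pt="$2" -v pns="$3" -v gt="$4" -v gns="$5" 'BEGIN {
    printf "tokens_per_sec=%.2f\n", gt / (gns / 1e9)     # generation speed
    printf "prompt_tps=%.2f\n",     pt / (pns / 1e9)     # prompt eval speed
    printf "ttft_ms=%.1f\n",        (load + pns) / 1e6   # load + prompt eval
    printf "itl_ms=%.2f\n",         (gns / gt) / 1e6     # average time per output token
    printf "load_ms=%.1f\n",        load / 1e6           # model load time
  }'
}
```

With a 0.5 s load, 100 prompt tokens evaluated in 0.25 s, and 200 tokens generated in 4 s, metrics 500000000 100 250000000 200 4000000000 reports 50 tokens/sec, 400 prompt tokens/sec, a 750 ms TTFT, and a 20 ms ITL.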

⚙️ Configuration

State is stored in ~/.lmspeedtest/:

  • config.json: Server profiles (host URL, auth token, active profile)
  • results.json: The 3 most recent benchmark results per model per server

Multi-Server Setup

Manage multiple Ollama/LM Studio servers with named profiles:

# Add servers
./lmspeedtest connect --add desktop --host http://192.168.1.10:11434
./lmspeedtest connect --add laptop --host http://192.168.1.11:11434 --token sk-abc

# List profiles
./lmspeedtest connect --list

# Switch active server
./lmspeedtest connect --use desktop

# Remove a profile
./lmspeedtest connect --remove laptop

Results are stored per-server, so you can benchmark the same model on different hardware and compare.
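Benchmarking the same models on several profiles in one go can be scripted with the commands shown above. This is a sketch: the function name bench_profiles and the LMSPEEDTEST override variable are hypothetical, and the profiles are assumed to have been added beforehand with connect --add.

```shell
# Binary to invoke; override for a dry run, e.g. LMSPEEDTEST=echo (hypothetical variable).
LMSPEEDTEST="${LMSPEEDTEST:-./lmspeedtest}"

# Hypothetical helper: switch to each profile in turn and benchmark
# all models up to max_gb without interactive selection.
# Usage: bench_profiles <max_gb> <profile>...
bench_profiles() {
  local max_gb="$1"
  shift
  local profile
  for profile in "$@"; do
    "$LMSPEEDTEST" connect --use "$profile" || return $?
    "$LMSPEEDTEST" test "$max_gb" --all || return $?
  done
}
```

For example, bench_profiles 8 desktop laptop runs the batch benchmark on both machines; note that the last profile in the list stays active afterwards.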


📋 Requirements

  • Go 1.25.8+
  • Ollama or LM Studio instance (local or remote)

🀝 Contributing

Contributions are welcome! The codebase is organized into logical files:

  • main.go β€” Entry point and command routing

  • types.go β€” Structs and data models

  • config.go β€” Configuration persistence and migration

  • styles.go β€” Terminal styling helpers

  • commands_connect.go β€” Server profile management

  • commands_models.go β€” Model listing and benchmarking

  • benchmark.go β€” Core benchmark logic

  • commands_other.go β€” Dashboard, compare, export, serve, reset

  • New subcommand β†’ add case in main() switch

  • New API call β†’ add helper in the relevant command file

  • New TUI form β†’ use huh.NewForm(...) pattern

Please ensure your code passes golangci-lint and gosec.


☕ Support

If you find this tool useful, consider buying me a coffee:

Buy Me A Coffee

📄 License

MIT License. See LICENSE for details.


Made with a fucked-up back and extremely questionable sleep habits by @notfixingit3
