FunASR Model Selection Guide

Use this guide when you are choosing a first model, comparing FunASR with Whisper or a cloud ASR provider, or deciding which model alias to expose through the OpenAI-compatible API.

Fast default path

If you are unsure, start with SenseVoice-Small:

```python
from funasr import AutoModel

model = AutoModel(
    model="iic/SenseVoiceSmall",
    vad_model="fsmn-vad",
    spk_model="cam++",
    device="cuda",  # use "cpu" for a portable smoke test
)
result = model.generate(input="meeting.wav")
```

It is a strong first choice for demos, private APIs, multilingual transcription, speaker-aware meeting transcripts, and agent voice input. Switch only when your workload has a clear requirement, such as Mandarin production accuracy, streaming latency, or LLM-based ASR experiments.
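
SenseVoice transcripts usually carry inline tags for language, emotion, and audio events ahead of the text. FunASR ships a `rich_transcription_postprocess` helper in `funasr.utils.postprocess_utils` for cleaning these up; the standard-library sketch below only illustrates the idea. It assumes tags of the form `<|NAME|>`, and both the `split_tags` helper and the sample string are hypothetical rather than real model output.

```python
from __future__ import annotations

import re

# Assumption: tags look like <|en|>, <|NEUTRAL|>, <|Speech|>. Adjust the
# pattern if your FunASR version emits a different format.
TAG_PATTERN = re.compile(r"<\|([^|<>]*)\|>")


def split_tags(raw_text: str) -> tuple[list[str], str]:
    """Return (tags, plain_text) for one tagged transcript string."""
    tags = TAG_PATTERN.findall(raw_text)
    plain_text = TAG_PATTERN.sub("", raw_text).strip()
    return tags, plain_text


# Hypothetical sample; in practice the string would come from result[0]["text"].
sample = "<|en|><|NEUTRAL|><|Speech|><|woitn|>hello and welcome to the meeting"
tags, text = split_tags(sample)
print(tags)
print(text)
```

Keeping the tags next to the plain text lets a downstream service filter on emotion or audio events without parsing the transcript again.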

Decision table

| Need | Start with | Why | Next doc |
| --- | --- | --- | --- |
| Fast multilingual private transcription | SenseVoice-Small | Strong default with ASR, emotion tags, audio event tags, and CPU viability. | README quick start |
| Mandarin production ASR | Paraformer-Large | Mature Chinese ASR path with VAD and punctuation. | Tutorial |
| English-only route in the OpenAI API example | `paraformer-en` alias | Smaller English route for API compatibility checks. | OpenAI API example |
| LLM-based ASR or 31-language experiments | Fun-ASR-Nano | LLM-based model path; use vLLM when decoder throughput matters. | vLLM guide |
| Live captions or call-center streams | Runtime WebSocket service | Designed for long-lived streaming sessions and partial results. | Runtime service docs |
| Batch archive processing | SenseVoice-Small or Paraformer-Large | Stable offline transcription path; caller owns manifests, retries, and logs. | Batch ASR example |
| Migration from Whisper/cloud ASR | SenseVoice-Small first, then benchmark alternatives | Gives a strong baseline before deeper model-specific tuning. | Migration guide |
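
The table can also be read as a short chain of checks. The sketch below is one possible encoding, not part of FunASR: the `Workload` fields, the `starting_point` function, and the order in which the checks run are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    needs_streaming: bool = False
    llm_experiment: bool = False
    mostly_mandarin: bool = False
    english_api_check: bool = False


def starting_point(workload: Workload) -> str:
    """Map a workload to the 'Start with' column of the decision table."""
    # Assumed priority: hard runtime constraints first, then model preferences.
    if workload.needs_streaming:
        return "Runtime WebSocket service"
    if workload.llm_experiment:
        return "Fun-ASR-Nano"
    if workload.mostly_mandarin:
        return "Paraformer-Large"
    if workload.english_api_check:
        return "paraformer-en alias"
    # Default for multilingual, batch, and migration baselines.
    return "SenseVoice-Small"


print(starting_point(Workload()))
print(starting_point(Workload(mostly_mandarin=True)))
```

Treat the result as a first candidate to benchmark, not a final decision.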

OpenAI-compatible API aliases

The `examples/openai_api` server exposes short aliases so application teams do not need to know model repository IDs:

| Alias | Underlying path | Use when |
| --- | --- | --- |
| `sensevoice` | `iic/SenseVoiceSmall` | You want the default private speech API with multilingual ASR, event tags, and good CPU/GPU behavior. |
| `paraformer` | `paraformer-zh` with VAD and punctuation | You want a Mandarin-oriented production route. |
| `paraformer-en` | `paraformer-en` with VAD | You want a compact English route in OpenAI-style clients. |
| `fun-asr-nano` | `FunAudioLLM/Fun-ASR-Nano-2512` | You are evaluating LLM-based ASR, 31-language coverage, or vLLM acceleration. |
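
If your own tooling needs the same mapping (for example, to label benchmark results), a small lookup that rejects unknown names keeps typos from silently selecting the wrong model. This is a sketch that mirrors the table above, not the server's actual implementation; `ALIASES`, `resolve_alias`, and the `vad`/`punc` keys are hypothetical names.

```python
# Mirrors the alias table; the dictionary layout is an assumption.
ALIASES = {
    "sensevoice": {"model": "iic/SenseVoiceSmall"},
    "paraformer": {"model": "paraformer-zh", "vad": True, "punc": True},
    "paraformer-en": {"model": "paraformer-en", "vad": True},
    "fun-asr-nano": {"model": "FunAudioLLM/Fun-ASR-Nano-2512"},
}


def resolve_alias(alias: str) -> dict:
    """Return a copy of the settings for an alias, or raise on unknown names."""
    try:
        return dict(ALIASES[alias])
    except KeyError:
        known = ", ".join(sorted(ALIASES))
        raise ValueError(
            f"unknown model alias {alias!r}; expected one of: {known}"
        ) from None


print(resolve_alias("paraformer")["model"])
```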

Check the live service before wiring clients:

```shell
curl http://localhost:8000/v1/models
python examples/openai_api/smoke_test.py --base-url http://localhost:8000 --model sensevoice
```
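
The same check can run from a deployment script. The sketch below uses only the standard library and assumes the server returns the usual OpenAI-style list shape, `{"data": [{"id": "..."}]}`; verify that against your server's actual response. The function names are hypothetical.

```python
from __future__ import annotations

import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query <base_url>/v1/models on a running server and return the IDs."""
    url = base_url.rstrip("/") + "/v1/models"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return model_ids(json.load(response))


def missing_aliases(available: list[str], required: list[str]) -> list[str]:
    """Return the required aliases that the server does not advertise."""
    return sorted(set(required) - set(available))


# Offline demonstration with a hand-written payload (not real server output).
sample_payload = {"object": "list", "data": [{"id": "sensevoice"}, {"id": "paraformer"}]}
print(missing_aliases(model_ids(sample_payload), ["sensevoice", "fun-asr-nano"]))
```

Against a live service, call `fetch_model_ids("http://localhost:8000")` and fail the rollout if `missing_aliases` returns anything.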

For SDK, JavaScript, workflow, Postman, OpenAPI, Docker, and Kubernetes paths, start from the OpenAI API example.

Runtime choice by workload

| Workload | Runtime path | Notes |
| --- | --- | --- |
| Notebook or one-off evaluation | Python `AutoModel` | Shortest path for install, model download, and output-shape checks. |
| Internal HTTP service | OpenAI-compatible API | Reuse OpenAI-style clients, Dify, n8n, LangChain, AutoGen, and HTTP nodes. |
| Repeatable local container demo | Docker Compose API | CPU-first smoke test; adapt the image before using CUDA. |
| Internal cluster service | Kubernetes API template | Private `ClusterIP`, persistent model cache, `/health` probes, and port-forward smoke test. |
| Live audio | Runtime WebSocket service | Validate chunk size, VAD, endpointing, reconnects, and client backpressure with real audio. |
| LLM-based ASR throughput | vLLM path for Fun-ASR-Nano | vLLM accelerates autoregressive decoding; it does not apply to non-autoregressive Paraformer. |

See the deployment matrix when you are choosing between these paths.
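
For the live-audio row, chunk size is easiest to reason about in bytes. The sketch below shows the arithmetic for raw PCM and a simple splitter. The 16 kHz, mono, 16-bit format and the 600 ms chunk length are placeholder assumptions; take the real values from your runtime service configuration. The helper names are hypothetical.

```python
from __future__ import annotations

from typing import Iterator


def chunk_size_bytes(
    sample_rate_hz: int,
    chunk_ms: int,
    sample_width_bytes: int = 2,
    channels: int = 1,
) -> int:
    """Bytes of raw PCM that cover chunk_ms milliseconds of audio."""
    frames = sample_rate_hz * chunk_ms // 1000
    return frames * sample_width_bytes * channels


def iter_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive chunks; the final chunk may be shorter than size."""
    if size <= 0:
        raise ValueError("chunk size must be positive")
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


# Assumed format: 16 kHz, mono, 16-bit samples, 600 ms per chunk.
size = chunk_size_bytes(sample_rate_hz=16000, chunk_ms=600)
one_second_of_silence = bytes(16000 * 2)
chunks = list(iter_chunks(one_second_of_silence, size))
print(size, [len(chunk) for chunk in chunks])  # 19200 [19200, 12800]
```

The short final chunk is the case worth testing on the client side, along with reconnects and backpressure.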

Benchmark before committing

Do not choose a model from a single clean demo file. Use a small representative set first:

- Collect 20-50 audio files that cover short clips, long meetings, silence, noise, overlapping speakers, domain vocabulary, and target languages.
- Record model name, model revision, FunASR version, device, CPU/GPU type, CUDA/PyTorch version, runtime path, batch size, and whether warmup/model download time is excluded.
- Track quality with your normal WER/CER or human review process, not only transcript readability.
- Track latency, throughput, memory, failures, and upload size limits together.
- Keep at least one public sample for smoke tests and at least one private realistic sample for deployment validation.
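
If you do not already have scoring tooling, the sketch below shows the standard edit-distance definition of WER and CER alongside a run record that captures some of the metadata listed above. It is a minimal illustration: it applies no text normalization (casing, punctuation, number formatting), which your normal evaluation process should define. The `BenchmarkRun` fields and the sample values are hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance; substitution, insertion, and deletion cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_item != hyp_item)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    """Edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring whitespace."""
    return error_rate("".join(reference.split()), "".join(hypothesis.split()))


@dataclass
class BenchmarkRun:
    """Hypothetical record of the settings behind one benchmark result."""
    model: str
    model_revision: str
    funasr_version: str
    device: str
    runtime_path: str
    batch_size: int
    warmup_excluded: bool


# Placeholder values; fill in what you actually ran.
run = BenchmarkRun(
    model="iic/SenseVoiceSmall",
    model_revision="<revision>",
    funasr_version="<version>",
    device="cpu",
    runtime_path="python-automodel",
    batch_size=1,
    warmup_excluded=True,
)
print(json.dumps(asdict(run)))
print(wer("the cat sat", "the cat sat down"))
```

Storing the run record next to every score makes it possible to tell later whether two numbers are actually comparable.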

For migration work, use the migration benchmark example and the migration guide.

Practical recommendations

- Start with SenseVoice-Small for demos, private APIs, agent voice input, and multilingual workloads.
- Use Paraformer when your production traffic is primarily Mandarin and you want the mature non-autoregressive ASR path.
- Use Fun-ASR-Nano when you specifically want the LLM-based model path or vLLM acceleration experiments.
- Use the streaming runtime when partial results and long-lived connections matter more than a single final transcript.
- Keep model aliases stable in production runbooks so benchmark results and bug reports are reproducible.
- Open a Deployment Help issue with model, device, command, logs, audio duration, and runtime path when you get stuck.
