📚 Documentation
Choose a sensible first model by device performance, latency target, and workload type instead of guessing from model names alone.
Model Usage Recommendations
[Screenshot: Transcription settings overview]
What This Page Solves
This page does not explain every parameter. It answers the more useful question:
What should I try first on this machine, for this kind of task?
Choose By Goal First, Not By Model Name
Start from what matters most:
- accuracy first: Start with Whisper.
- lower latency first: Start with a realtime model.
- lower-power device, stability first: Start with lighter realtime models or lighter Whisper tiers.
- strong GPU, realtime but quality-sensitive: Whisper is still worth testing.
| Situation | First choice | Alternative | Usually avoid |
|---|---|---|---|
| file transcription, safe first baseline | Whisper Small / Medium | move up to Large only after sample validation | jumping straight to the heaviest model |
| long-form audio, higher-quality final output | Whisper Large family | Medium plus tuning when hardware is tighter | defaulting to Large on weak hardware |
| realtime microphone without a strong GPU | Realtime model | Whisper after latency validation | heavy Whisper tiers on weak hardware |
| realtime app capture with long live sessions | Realtime model | Whisper if GPU headroom and RTF (real-time factor) both validate well | choosing by quality alone and ignoring session stability |
| lower-power devices | Lightweight realtime or Whisper Tiny / Base | upgrade later after real validation | starting with heavy models |
| strong GPU, realtime plus higher quality | Whisper | realtime models when lower latency matters more | assuming realtime models are always better in every live scenario |
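The table above can be read as a small decision procedure. The following is a minimal sketch of that reading, not an API of any transcription tool: the function name `recommend_first_model`, the `Recommendation` type, and the boolean flags are all hypothetical, and the returned strings are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    first_choice: str
    alternative: str
    usually_avoid: str


def recommend_first_model(
    workload: str,                    # "file" or "realtime" (hypothetical labels)
    strong_gpu: bool = False,
    low_power: bool = False,
    long_form: bool = False,
    quality_sensitive: bool = False,
) -> Recommendation:
    """Return the table row that matches the situation. Illustrative only."""
    if workload not in ("file", "realtime"):
        raise ValueError(f"unknown workload: {workload!r}")

    # Lower-power devices come first: stability outranks peak quality.
    if low_power:
        return Recommendation(
            "Lightweight realtime or Whisper Tiny / Base",
            "upgrade later after real validation",
            "starting with heavy models",
        )

    if workload == "file":
        if long_form:
            return Recommendation(
                "Whisper Large family",
                "Medium plus tuning when hardware is tighter",
                "defaulting to Large on weak hardware",
            )
        return Recommendation(
            "Whisper Small / Medium",
            "move up to Large only after sample validation",
            "jumping straight to the heaviest model",
        )

    # Realtime workloads.
    if strong_gpu and quality_sensitive:
        return Recommendation(
            "Whisper",
            "realtime models when lower latency matters more",
            "assuming realtime models are always better in every live scenario",
        )
    return Recommendation(
        "Realtime model",
        "Whisper after latency validation",
        "heavy Whisper tiers on weak hardware",
    )


print(recommend_first_model("realtime", strong_gpu=True, quality_sensitive=True).first_choice)
```

The ordering of the checks is a design choice in this sketch: the low-power rule is tested first so that a modest machine never receives a heavy recommendation, whatever the workload.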
Three Rules That Help Most
1. Separate file workflows from realtime workflows
- file workflows usually lean toward Whisper
- realtime workflows usually lean toward realtime models
- strong GPUs can make Whisper viable in realtime too
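Whether Whisper is viable in realtime on a given machine can be checked with the real-time factor (RTF): processing time divided by audio duration, where values below 1.0 mean the model runs faster than the audio plays. The sketch below assumes a hypothetical `transcribe` callable supplied by you; the 0.5 headroom threshold is an illustrative assumption, not a documented limit.

```python
import time
from typing import Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration. Below 1.0 is faster than realtime."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe: Callable[[str], str], audio_path: str, audio_seconds: float) -> float:
    """Time one call to a user-supplied (hypothetical) transcribe function."""
    start = time.perf_counter()
    transcribe(audio_path)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)


def has_realtime_headroom(rtf: float, max_rtf: float = 0.5) -> bool:
    """Assumed rule of thumb: leave margin below 1.0 for long live sessions."""
    return rtf <= max_rtf


# Example: 30 s of processing for a 60 s clip.
rtf = real_time_factor(30.0, 60.0)
print(rtf, has_realtime_headroom(rtf))
```

Measuring on several of your own recordings, rather than one short clip, gives a more honest picture of how the model behaves in long sessions.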
2. On lower-power devices, stability beats theoretical peak quality
If the machine is modest, the first goal is a workflow that finishes reliably. Optimization comes second.
3. For teams, define defaults before opening power-user routes
A practical team policy is:
- set one common baseline
- add a heavier Whisper lane for critical deliverables
- document sample validation instead of making every teammate rediscover it
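A team policy like this can be written down as a small shared record so the defaults are explicit. This is only a sketch: `TeamTranscriptionPolicy`, its fields, and the example model names and notes are hypothetical placeholders, not settings from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TeamTranscriptionPolicy:
    baseline_model: str                 # the one common baseline
    critical_model: str                 # heavier Whisper lane for critical deliverables
    validation_notes: dict = field(default_factory=dict)  # sample name -> what was observed

    def model_for(self, critical: bool = False) -> str:
        """Pick the lane: the baseline by default, the heavier lane only when asked."""
        return self.critical_model if critical else self.baseline_model


# Illustrative values only.
policy = TeamTranscriptionPolicy(
    baseline_model="Whisper Small",
    critical_model="Whisper Large",
    validation_notes={"weekly-meeting-sample": "baseline acceptable, checked on team laptops"},
)

print(policy.model_for())               # Whisper Small
print(policy.model_for(critical=True))  # Whisper Large
```

Keeping the validation notes next to the defaults means a teammate can see why the baseline was chosen without repeating the test.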
Common Mistakes And Troubleshooting
- Trusting public benchmarks more than your own samples: Always validate against your real meetings, lectures, or interviews.
- Forcing one model into every workflow: At least separate accuracy-first from latency-first use.
- Not re-testing after hardware changes: Model choices are strongly hardware-dependent.
- Treating realtime models as realtime-only forever: They are usually the better fit for realtime work, but they can still be used in other workflows.
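Validating against your own samples, and re-testing after hardware or model changes, needs a repeatable score. One common choice is word error rate (WER): the word-level edit distance between a hand-checked reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python version with simple lowercase and whitespace normalization, which is an assumption; real evaluations often normalize punctuation as well.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                cur[j - 1] + 1,      # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


# One missing word out of four reference words.
print(word_error_rate("a b c d", "a b c"))  # 0.25
```

Running the same reference samples through each candidate model, and again after a hardware change, turns the model choice into a comparison of numbers rather than impressions.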
Read Next
- Terminology and route boundaries: Concepts
- When GPU is worth it: GPU Transcription
- Your first file workflow: File Transcription
- Live microphone workflow: Realtime Microphone Transcription