Understand when GPU acceleration helps, which backend is typical on each platform, and when CPU is still the safer choice.
GPU acceleration, CUDA, Vulkan
GPU Transcription
Performance
[Screenshot: GPU and runtime settings]
What This Page Solves
GPU should not be enabled just because it exists. It is most useful when it solves a real bottleneck:
- long-form processing time
- heavier Whisper models
- stronger realtime performance for Whisper on capable machines
This page helps answer: when is GPU worth enabling, and when is CPU still the safer default?
When GPU Is Worth Testing
Strong candidates for GPU
- long audio or video files
- Medium or Large Whisper models
- repeated batch work
- realtime Whisper workflows on strong hardware
Cases where GPU may not be the first priority
- your main route uses realtime models, which often do not depend on GPU
- the workload is short and light
- the first goal is simply getting a stable baseline
- the GPU runtime on the machine is not reliable yet
Typical Platform Choices
| Platform or device | Common path | Why |
|---|---|---|
| Windows + NVIDIA | CUDA | usually the most direct performance path |
| Windows + non-NVIDIA or broader compatibility needs | Vulkan | wider hardware coverage |
| macOS | CoreML | the typical Apple-device acceleration route |
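The table can be read as a small lookup. The following is a minimal sketch, not the application's actual selection logic: the function name `pick_backend` and the `has_nvidia_gpu` flag are hypothetical, and the CPU fallback for platforms the table does not list is an assumption made for this example.

```python
import platform


def pick_backend(system: str, has_nvidia_gpu: bool) -> str:
    """Return the common acceleration path from the table for a platform.

    `system` uses the values reported by platform.system().
    """
    if system == "Windows":
        # NVIDIA hardware usually takes CUDA; everything else takes Vulkan.
        return "CUDA" if has_nvidia_gpu else "Vulkan"
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return "CoreML"
    # Assumption: the table does not cover other platforms, so the sketch
    # makes no guess and stays on CPU.
    return "CPU"


# has_nvidia_gpu is passed in by hand; detecting it is outside this sketch.
print(pick_backend(platform.system(), has_nvidia_gpu=False))
```

The result is only a starting point for testing. Whether the chosen backend is stable on a given machine still depends on drivers and runtime state.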
A Safer Validation Method
- Use the same file, language, and model on CPU and GPU.
- Measure both time and failure rate.
- Test short and long files separately.
- Keep GPU as default only if the speed gain is consistent and operational cost stays low.
That avoids false optimization where peak speed looks better but real-world reliability gets worse.
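The validation method above can be sketched as a small harness. This is an illustration under stated assumptions, not a built-in tool: `transcribe` stands for any callable you supply that takes a file path and a device name and raises on failure, and the names `RunStats`, `benchmark`, and `gpu_is_worth_default`, as well as the default thresholds (1.5x speedup, no extra failures), are hypothetical values chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RunStats:
    device: str
    runs: int = 0
    failures: int = 0
    total_seconds: float = 0.0  # time spent in successful runs only

    @property
    def failure_rate(self) -> float:
        return self.failures / self.runs if self.runs else 0.0

    @property
    def mean_seconds(self) -> float:
        ok = self.runs - self.failures
        return self.total_seconds / ok if ok else float("inf")


def benchmark(transcribe: Callable[[str, str], object],
              files: Sequence[str], device: str, repeats: int = 3) -> RunStats:
    """Run the same files on one device, recording time and failures."""
    stats = RunStats(device)
    for path in files:
        for _ in range(repeats):
            stats.runs += 1
            start = time.perf_counter()
            try:
                transcribe(path, device)
            except Exception:
                stats.failures += 1  # a failed run counts, but is not timed
                continue
            stats.total_seconds += time.perf_counter() - start
    return stats


def gpu_is_worth_default(cpu: RunStats, gpu: RunStats,
                         min_speedup: float = 1.5,
                         max_extra_failure_rate: float = 0.0) -> bool:
    """Keep GPU as default only if it is faster and not less reliable."""
    if cpu.runs == cpu.failures or gpu.runs == gpu.failures:
        return False  # no successful run on one side, nothing to compare
    if gpu.failure_rate - cpu.failure_rate > max_extra_failure_rate:
        return False
    return cpu.mean_seconds >= min_speedup * gpu.mean_seconds
```

To test short and long files separately, call `benchmark` once per group on each device and compare each pair of results on its own. A speedup that only appears on long files is a reason to enable GPU for that workload, not necessarily as the global default.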
Common Mistakes And Troubleshooting
- Turning on GPU and immediately raising full concurrency: validate single-task stability first.
- Judging compatibility by GPU model alone: drivers, runtimes, and OS state matter too.
- Giving up on Whisper after one GPU failure: fall back to CPU and separate runtime problems from model-route problems.
- Treating GPU as required for realtime models: realtime models usually solve a different problem and often do not depend on GPU.
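The CPU fallback mentioned in the list can be sketched as follows. This is a minimal illustration: `transcribe` is a hypothetical callable you supply that takes a file path and a device name, and the device names `"gpu"` and `"cpu"` are placeholders rather than real setting values.

```python
from typing import Callable, Optional, Tuple


def transcribe_with_fallback(
    transcribe: Callable[[str, str], str], path: str
) -> Tuple[str, str, Optional[str]]:
    """Try GPU first, then retry the same file on CPU.

    Returns (text, device_used, gpu_error). If the CPU retry succeeds, the
    GPU failure points at the runtime (drivers, memory, OS state). If the
    CPU retry also raises, the exception propagates, which suggests the
    problem lies with the model route or the input instead.
    """
    try:
        return transcribe(path, "gpu"), "gpu", None
    except Exception as gpu_error:
        text = transcribe(path, "cpu")  # same file, same model, CPU only
        return text, "cpu", repr(gpu_error)
```

Keeping the recorded GPU error alongside the result makes it easier to tell a one-off runtime fault from a repeatable one before changing models.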
Read Next
- Route selection before acceleration: Concepts
- Device- and task-based model choice: Model Usage Recommendations
- Higher-quality file workflows: File Transcription
- Parameter tuning after the route is stable: Advanced Parameter Transcription