📚 Documentation
Minimal, targeted tuning for Whisper and realtime models when the baseline already works but accuracy, segmentation, or latency still need work.
Advanced Parameter Transcription
[Screenshot: advanced transcription tuning]
What This Page Solves
Advanced parameters are for targeted correction, not for first-time setup.
Use them when:
- the model route is already correct, but one class of errors keeps repeating
- the output is close to usable, but not stable enough yet
- you need a deliberate tradeoff between accuracy, segmentation, latency, and stability
When To Read This Page
Read it when:
- noisy audio causes hallucinated text
- names, terms, or abbreviations are unstable
- segments are too short or too long
- realtime latency is acceptable but segmentation still feels wrong
Do not start here when:
- you have not chosen between Whisper and realtime models yet
- your first baseline is not stable
- you are about to change many parameters at once
Recommended Tuning Order
- Change one parameter at a time, or two at most.
- Re-test on the same sample clip every round.
- Start from the symptom, not from the parameter list.
- Promote only proven settings into a reusable preset.
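The tuning order above can be sketched as a small check that runs before each round. This is a minimal illustration, not part of the app: the parameter names (`temperature`, `beam_size`, `max_context`) and the helpers `changed_keys`, `check_round`, and `word_error_rate` are hypothetical. It refuses a round that changes more than two parameters and scores each round against the same reference transcript using word error rate.

```python
from typing import Any, Dict, List


def changed_keys(baseline: Dict[str, Any], candidate: Dict[str, Any]) -> List[str]:
    """Return the sorted names of parameters whose values differ."""
    keys = set(baseline) | set(candidate)
    return sorted(k for k in keys if baseline.get(k) != candidate.get(k))


def check_round(
    baseline: Dict[str, Any], candidate: Dict[str, Any], max_changes: int = 2
) -> List[str]:
    """Reject a tuning round that changes too many parameters at once."""
    diff = changed_keys(baseline, candidate)
    if len(diff) > max_changes:
        raise ValueError(f"too many parameters changed in one round: {diff}")
    return diff


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)
```

Scoring every round against the same clip and the same reference text keeps the comparison fair: a lower error rate can then be attributed to the one or two parameters returned by `check_round`.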
Whisper: Which Problems It Usually Solves Best
Hallucination and repeated text
- no-speech threshold
- max context
- temperature
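To make the no-speech threshold concrete, here is a minimal sketch of one common way such a threshold is applied: segments whose estimated no-speech probability exceeds the threshold are discarded. It assumes each segment carries a `no_speech_prob` field, as the segments returned by the open-source `openai-whisper` package do. The helper name and the default of 0.6 are illustrative, and the app's internal logic may differ.

```python
from typing import Any, Dict, List


def drop_probable_silence(
    segments: List[Dict[str, Any]], no_speech_threshold: float = 0.6
) -> List[Dict[str, Any]]:
    """Keep only segments the model does not consider likely silence.

    A lower threshold removes more segments (fewer hallucinations on noise,
    but a higher risk of losing quiet speech); a higher threshold keeps more.
    """
    return [s for s in segments if s["no_speech_prob"] <= no_speech_threshold]
```

With made-up input such as one segment at `no_speech_prob=0.05` and another at `0.92`, only the first survives the default threshold.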
Terminology, proper nouns, abbreviations
- prompt
- beam search or greedy decoding
- best-of or beam-size
Only part of the source should be processed
- segment transcription
- offset range
- length limits
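An offset range boils down to selecting a window of samples before decoding. The sketch below shows the arithmetic, assuming raw mono samples and offsets given in milliseconds; `slice_by_offset` is a hypothetical helper, not an app API.

```python
from typing import Optional, Sequence


def slice_by_offset(
    samples: Sequence[float],
    sample_rate: int,
    offset_ms: int,
    duration_ms: Optional[int] = None,
) -> Sequence[float]:
    """Return the samples from offset_ms, limited to duration_ms if given."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    if offset_ms < 0 or (duration_ms is not None and duration_ms < 0):
        raise ValueError("offset and duration must not be negative")
    # Convert milliseconds to sample indices using integer arithmetic.
    start = offset_ms * sample_rate // 1000
    if duration_ms is None:
        return samples[start:]
    return samples[start : start + duration_ms * sample_rate // 1000]
```

For 16 kHz audio, an offset of 1000 ms with a duration of 500 ms selects 8000 samples starting at index 16000.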
Realtime Models: Which Problems They Usually Solve Best
Realtime-model tuning is usually less about decoding strategy and more about segmentation behavior:
- when speech should start
- when a segment should end
- how much padding should be added
- how to avoid fragments that are too short or too long
Typical controls include:
- VAD scene presets
- minimum speech and silence duration
- minimum and maximum segment duration
- pre/post padding and merge gap
- thread count
Internally, realtime-model inference is currently handled by MLEngine. For workflow purposes, you only need to know that these are the models better suited to realtime capture.
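How the segmentation controls listed above interact can be sketched as post-processing on raw voice-activity intervals. This is an illustration under stated assumptions, not the app's implementation: the function name, the parameter names, and the default values are all hypothetical, and intervals are `(start_ms, end_ms)` pairs.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms)


def shape_segments(
    speech: List[Interval],
    min_speech_ms: int = 250,
    merge_gap_ms: int = 300,
    pre_pad_ms: int = 100,
    post_pad_ms: int = 200,
    max_segment_ms: int = 15000,
) -> List[Interval]:
    """Turn raw speech intervals into padded, bounded segments."""
    # 1. Drop speech runs shorter than the minimum speech duration.
    kept = [(s, e) for s, e in sorted(speech) if e - s >= min_speech_ms]

    # 2. Merge neighbours separated by a pause no longer than the merge gap.
    merged: List[Interval] = []
    for s, e in kept:
        if merged and s - merged[-1][1] <= merge_gap_ms:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))

    # 3. Pad each segment. The start never goes below 0 or before the previous
    #    padded end (this assumes merge_gap_ms >= post_pad_ms, so the previous
    #    segment's padding cannot reach into the current speech).
    padded: List[Interval] = []
    for s, e in merged:
        floor = padded[-1][1] if padded else 0
        padded.append((max(s - pre_pad_ms, floor), e + post_pad_ms))

    # 4. Split anything longer than the maximum segment duration.
    result: List[Interval] = []
    for s, e in padded:
        while e - s > max_segment_ms:
            result.append((s, s + max_segment_ms))
            s += max_segment_ms
        result.append((s, e))
    return result
```

The order matters in this sketch: filtering before merging keeps short noise bursts from bridging two real utterances, and splitting last ensures padding does not push a segment past the maximum duration. A minimum segment duration is not enforced here, so the last chunk of a split can be short.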
Common Mistakes And Troubleshooting
- Expecting advanced parameters to replace model selection: they cannot fix the wrong route choice.
- Changing many controls in one pass: you lose the ability to explain the result.
- Copying file-transcription settings directly into realtime workflows: realtime workflows must respect latency and segmentation first.
- Increasing model size every time you see errors: sometimes the problem is language choice, VAD behavior, or source noise, not model size.
Read Next
- Route selection before tuning: Concepts
- Faster model selection by task and device: Model Usage Recommendations
- Core setup map: Settings Overview