📚 Documentation

Practical guide for transcribing audio and video files in Audio Note, with the right first-pass choices for Whisper, realtime models, and batch workflows.

📚 DocumentationGuide

File Transcription

Transcription

File transcription parameters screenshot

Screenshot

What This Page Solves

File Transcription is the most reliable starting point in Audio Note. It works best when you need:

audio or video files transcribed after the fact
better accuracy than a live workflow usually provides
exports, review, and repeatable team workflows

If you are not sure where to begin, begin here.

When To Use It

Prefer File Transcription when

accuracy matters more than instant text
you want to process long-form meetings, interviews, lectures, or replay media
you need batch work, exports, or a review step before sharing

Do not start here when

you need live microphone text: use Realtime Microphone Transcription
the source is still on a website: use Link Transcription
you want folders to queue automatically: use Watch Folder

Recommended Workflow

Start with one real sample file.
Choose Whisper or a realtime model, then lock language and export target.
Validate the result before launching a full batch.
Move accepted results into Note for cleanup.
Use AI Chat only after the transcript is trustworthy enough for semantic work.

The goal is not to process everything immediately. The goal is to prove that one parameter set works for this content type.

Key Decisions

1. Whisper or realtime model first

Situation	First choice	Why
archive review, long-form meetings, final-quality output	Whisper	stronger quality ceiling and better fit for editorial review
lower-power device, fast draft, lighter workloads	Realtime model	steadier throughput and friendlier device requirements
strong GPU and still quality-first	Whisper	often worth validating first when hardware is strong

2. What to lock first

For a first production pass, lock these four items before touching advanced parameters:

model route
language
GPU on or off
export target

Only move on to Advanced Parameter Transcription once the baseline is already close to usable.

3. When GPU is worth it

GPU is usually worth testing when you have:

long files
Medium or Large Whisper models
repeatable batch work
a stable GPU runtime on your machine

If you mainly use realtime models, GPU is usually not the first optimization to care about.

Common Mistakes And Troubleshooting

Mixing unrelated languages into one batch Split by language, content type, or downstream use.
Launching a full batch without a sample run Validate one or two files first.
Treating first export as final output Normalize names, numbers, and terminology in Note before sharing.
Changing model, GPU, and parameters at the same time Change one variable at a time so the result stays explainable.

If a task stalls, check in this order:

model download completeness
disk space and cache path health
language and model fit
GPU/runtime stability
source file integrity or container issues