Last updated: 2025-05-09

File Transcription

Welcome to the file transcription feature! File transcription converts audio and video files into text and offers the following capabilities:

  1. Multi-format support
  2. Multilingual support
  3. Export of transcribed text as subtitles
  4. Translation of transcribed text
  5. Editing of transcribed text (delete, merge, replace text, etc.)
  6. Transcription history

File transcription does not require uploading files to a server. All processing is performed locally on your device, which keeps your data secure and private.
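As an illustration of the subtitle export mentioned in the list above, here is a minimal Python sketch of how timed transcript segments could be written out in the SubRip (SRT) subtitle format. This is not Audio Note's actual implementation: the Segment structure and the function names are hypothetical, and the formats the application really exports are not stated on this page.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span; times are in seconds (hypothetical structure)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Build SRT text: a 1-based index, a time range, the text, then a blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    # Joining with a newline leaves one empty line between consecutive cues.
    return "\n".join(blocks)
```

Calling to_srt with a list of segments returns a single string that can be saved with an .srt extension and loaded by most video players.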

Supported File Formats

Audio Note supports various audio/video file formats:

  • Audio: MP3, WAV, M4A, FLAC, AAC
  • Video: MP4, AVI, MOV, MKV, FLV

The lists above cover only some commonly used audio/video file formats. Other formats may also work, and you are welcome to try them.
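If you want to sort a folder of files before transcribing them, the format lists above can be expressed as a simple extension check. The Python sketch below is only an illustration under that assumption; the set names and the classify_media function are hypothetical, and the application may well detect formats differently (for example, by inspecting file contents rather than names).

```python
from pathlib import Path

# Extensions taken from the lists above; other formats may also work.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".aac"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".flv"}


def classify_media(filename: str) -> str:
    """Return "audio", "video", or "untested" based on the file extension."""
    suffix = Path(filename).suffix.lower()  # case-insensitive comparison
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    # Not in the lists above: the file may still transcribe, but is untested.
    return "untested"
```

A result of "untested" does not mean the file is unsupported, only that its format is not among the commonly used ones listed here.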

Model Support

Audio Note supports all official Whisper models, as well as some community models. If you would like to recommend a community model, send us feedback and we will consider integrating it in the future.

Before transcribing, you may need to download the model you want to use in Settings - Models.

Transcription Languages

OpenAI trained Whisper on over 98 languages, but its WER (Word Error Rate) varies from language to language. Some languages have a low WER, while others have a WER high enough to make transcription inaccurate. We therefore recommend transcribing in the following languages:

Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.

Although the application lets you select languages other than those listed above, the model's WER for those languages may be higher, resulting in lower transcription quality.
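For readers unfamiliar with WER, it is commonly defined as the number of word substitutions, deletions, and insertions needed to turn the model's output into a correct reference transcript, divided by the number of words in the reference. The Python sketch below computes it with a standard word-level edit distance. It is a simplified illustration: the function name is hypothetical, and published benchmarks usually normalize case and punctuation first, which is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i reference words vs. an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the output "the cat sit on mat" gives one substitution and one deletion over six reference words, a WER of about 0.33. Because insertions are counted, WER can exceed 1.0 for very poor output.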

Copyright © 2025. Made by AudioNote, All rights reserved.