Real-time Microphone Transcription

TODO (Screenshot Replacement): Realtime microphone transcription page (App 2.0). Include: device selector, model selector, VAD toggle, live transcript area, and start/pause/stop controls. Suggested filename: realtime-microphone-v2-en.png
Scope
This workflow captures microphone input and outputs text in real time, with:
- Whisper or realtime model selection
- Optional VAD segmentation
- In-session text stream feedback
- Post-session handoff to Note and export
It does not capture system-wide app audio. For that, see Global Realtime (Beta).
Use Cases
- Personal meeting capture
- Lecture note-taking
- Spoken drafting and brainstorming
Steps
- Open Real-time Transcription from Home.
- Select model and language.
- Select microphone and verify OS microphone permission.
- Configure options (GPU, VAD, translation).
- Start and monitor realtime output.
- Stop and continue processing in the Note page.
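The session lifecycle implied by the steps above (start, pause, resume, stop) can be pictured as a small state machine. The sketch below is purely illustrative: the `Session` class, the `State` names, and the `TRANSITIONS` table are hypothetical and do not describe the app's internals. It only shows why a stopped session cannot be restarted and why settings are fixed before capture begins.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"


# Allowed transitions (hypothetical): action -> {current state: next state}
TRANSITIONS = {
    "start": {State.IDLE: State.RUNNING},
    "pause": {State.RUNNING: State.PAUSED},
    "resume": {State.PAUSED: State.RUNNING},
    "stop": {State.RUNNING: State.STOPPED, State.PAUSED: State.STOPPED},
}


class Session:
    """Illustrative capture session; settings are chosen once, before start."""

    def __init__(self, model, language, microphone):
        self.model = model
        self.language = language
        self.microphone = microphone
        self.state = State.IDLE

    def apply(self, action):
        """Move to the next state, or raise if the action is not allowed now."""
        next_state = TRANSITIONS.get(action, {}).get(self.state)
        if next_state is None:
            raise ValueError(f"cannot {action} while {self.state.value}")
        self.state = next_state
        return self.state
```

In this sketch, a session that has been stopped rejects a further `start`, which mirrors the handoff to the Note page once capture ends.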
Term Explanations
- VAD (Voice Activity Detection): detects speech/non-speech segments for cleaner chunking.
- Realtime model: optimized for low-latency response rather than maximum offline accuracy.
- In-session stream: interim live text; final review in Note is still recommended.
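To make the VAD term concrete, here is a minimal energy-threshold sketch. It is not the detector the app uses (that is not documented here); the function names, the frame size, the threshold, and the `hangover` parameter are all assumptions chosen for illustration. It marks a frame as speech when its RMS amplitude exceeds a threshold and closes a segment only after several quiet frames in a row.

```python
import math


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of float samples in [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech_segments(samples, frame_size=160, threshold=0.02, hangover=2):
    """Return (start, end) sample-index pairs for regions judged to be speech.

    A frame counts as speech when its RMS exceeds `threshold` (assumed value).
    A segment is closed only after more than `hangover` consecutive quiet
    frames, so a short pause does not split a sentence into two chunks.
    """
    segments = []
    start = None   # sample index where the currently open segment began
    last_end = 0   # end index of the most recent speech frame
    quiet = 0      # consecutive quiet frames seen inside an open segment
    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        if frame_rms(frame) > threshold:
            if start is None:
                start = offset
            last_end = offset + len(frame)
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                segments.append((start, last_end))
                start = None
                quiet = 0
    if start is not None:
        segments.append((start, last_end))
    return segments
```

Each returned pair is one chunk that could be passed to a transcription model on its own, which is the "cleaner chunking" the definition refers to. Production detectors are typically more robust to noise than a fixed threshold.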
Real Scenario: 30-minute Product Review Meeting
- Before the meeting, lock microphone, language, and model (usually realtime-first).
- During the meeting, avoid heavy inline editing and keep notes lightweight.
- After the meeting, move to Note Workspace for terminology cleanup, summary, and export.
This pattern usually delivers better outcomes than trying to perfect transcripts while the discussion is still live.
Common Mistakes
- Mistake 1: Switching models repeatedly during a live session.
  Fix: keep model settings stable during capture and optimize afterward.
- Mistake 2: Treating live stream text as final copy.
  Fix: use Note review as the final quality gate before sharing.
- Mistake 3: Ignoring microphone input chain issues.
  Fix: check permissions, device routing, and input level before tuning models.
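For the input-level check in Mistake 3, a rough way to reason about "too quiet" versus "too loud" is the RMS level in dBFS (decibels relative to full scale). The sketch below assumes signed 16-bit PCM samples already captured from the microphone; the function names and the -50 dBFS and -3 dBFS cutoffs are illustrative assumptions, not values taken from the app.

```python
import math

FULL_SCALE_16BIT = 32768.0  # magnitude of the most negative signed 16-bit sample


def level_dbfs(pcm16):
    """RMS level of signed 16-bit PCM samples, in dB relative to full scale."""
    if not pcm16:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in pcm16) / len(pcm16))
    if rms == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(rms / FULL_SCALE_16BIT)


def classify_level(dbfs, too_quiet=-50.0, too_hot=-3.0):
    """Label a level using assumed thresholds (tune for your own setup)."""
    if dbfs < too_quiet:
        return "too quiet"
    if dbfs > too_hot:
        return "clipping risk"
    return "ok"
```

A reading of "too quiet" while you are speaking usually points to a permission, routing, or gain problem rather than a model problem, which is why this check comes before any model tuning.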
FAQ
Q: Can I pause and resume?
A: Yes, pause/resume is supported during a session.
Q: Why does microphone transcription fail to start?
A: Check microphone permission, model availability, and device performance.
Q: Can I export output?
A: Yes, export is available after session completion via Note workflows.
Limitations
- Realtime quality depends on microphone quality and ambient noise.
- Some advanced options and models may depend on subscription entitlements.
- On low-end hardware, realtime models are usually the safer default.
- Platform: Windows and macOS are both supported, with differences mainly in permission dialogs and audio driver chains.