Capture a selected app's audio in real time, then choose Whisper or realtime models based on latency, hardware, and review goals.
Realtime App Transcription
What This Page Solves
Realtime App Transcription is for app audio you want to read while it is happening, such as:
- meetings in Zoom, Teams, or Google Meet
- webinars and online classes
- app-based demos or replay review sessions
This is a live workflow. It is meant to reduce delay between hearing and reading, then hand the result off to Note for cleanup.
When To Use It
Prefer this workflow when
- the source is app audio rather than a local file
- you need live text during the session
- you still want the result to flow into notes afterward
Do not start here when
- you already have the source file: use File Transcription
- the main source is your microphone: use Realtime Microphone Transcription
- you need a record-first archive: use Recording
Recommended Workflow
- Choose the target app before the session starts.
- Run a short smoke test for permissions, routing, and text refresh.
- Pick Whisper or a realtime model based on latency and hardware.
- Keep the setup stable during the session instead of switching devices or models midstream.
- After the session, continue in Note for names, action items, and structure cleanup.
Key Decisions
1. Realtime model or Whisper
| Situation | First choice | Why |
|---|---|---|
| no strong GPU, steady low-latency capture matters most | Realtime model | lighter runtime pressure and better realtime defaults |
| strong GPU and higher-quality live text target | Whisper | can still hold a good real-time factor (RTF) when hardware is strong enough |
| long sessions on mixed hardware | Realtime model | more forgiving for stability-first live capture |
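The table above can be read as a small decision rule. The following is a minimal sketch, not the app's actual logic: `SessionProfile`, its field names, and `first_choice` are hypothetical, and the precedence used when several rows apply at once (stability wins over quality) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionProfile:
    """Hypothetical description of a planned live session."""
    strong_gpu: bool
    wants_higher_quality: bool = False
    long_session_mixed_hardware: bool = False


def first_choice(profile: SessionProfile) -> str:
    """Return the first model family to try, following the table rows."""
    # Assumption: the stability-first row takes precedence when rows overlap.
    if profile.long_session_mixed_hardware:
        return "realtime model"
    # Whisper is the first choice only with a strong GPU and a quality target.
    if profile.strong_gpu and profile.wants_higher_quality:
        return "Whisper"
    # Everything else falls back to the lighter realtime model.
    return "realtime model"


print(first_choice(SessionProfile(strong_gpu=True, wants_higher_quality=True)))
```

Treat the result as a starting point for the smoke test, not as a final answer.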
2. What to validate before a real meeting
Validate these first:
- app capture permission
- correct target app selection
- text latency
- segmentation stability
- whether the result lands correctly in Note afterward
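For the text latency check, it can help to note a few timestamps during the smoke test and summarize them. This is a minimal sketch: the `(spoken_at, shown_at)` sample format, the function name, and both thresholds are assumptions chosen for illustration, not values from the app.

```python
from statistics import median


def summarize_latency(samples, max_median_s=2.0, max_worst_s=5.0):
    """Summarize delay between hearing and reading.

    samples: list of (spoken_at, shown_at) pairs in seconds, noted by hand
    or from a test clip. The thresholds are illustrative assumptions.
    """
    if not samples:
        raise ValueError("need at least one sample")
    delays = [shown - spoken for spoken, shown in samples]
    if any(d < 0 for d in delays):
        raise ValueError("text cannot appear before the audio was spoken")
    med = median(delays)
    worst = max(delays)
    return {
        "median_s": med,
        "worst_s": worst,
        "ok": med <= max_median_s and worst <= max_worst_s,
    }


report = summarize_latency([(0.0, 1.0), (5.0, 6.5), (10.0, 12.0)])
print(report)
```

A delay that grows from sample to sample suggests the model is falling behind rather than showing a fixed lag.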
3. Why this is different from file review
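In file workflows you can optimize for accuracy first. In realtime app workflows, latency and stability have to be right first, because text that arrives late or stalls defeats the purpose of a live transcript. That is why the best model for the job is not always the largest one.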
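A rough way to see why model size is not everything is the real-time factor (RTF), commonly defined as processing time divided by audio duration. The sketch below uses a simplified model that assumes audio is processed sequentially and nothing is dropped; the function names are hypothetical.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF below 1.0 means the model processes audio faster than it arrives."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def lag_after(session_seconds: float, rtf: float) -> float:
    """Estimated delay of the newest text at the end of a session.

    Simplified assumption: sequential processing, no dropped audio.
    With RTF at or below 1.0 the model keeps up, so the lag stays at zero.
    """
    return max(0.0, session_seconds * (rtf - 1.0))


# A model that needs 30 s to process 60 s of audio keeps up.
print(real_time_factor(30, 60))
# A model running at RTF 1.25 falls further behind over a one-hour session.
print(lag_after(3600, 1.25))
```

Under these assumptions, a more accurate model with an RTF above 1.0 is unsuitable for live use no matter how good its file results are.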
Common Mistakes And Troubleshooting
- Granting permissions after the meeting already started: finish permissions and routing checks first.
- Switching audio devices during capture: keep routing stable during the session whenever possible.
- Assuming realtime means realtime models only: Whisper is still valid when GPU headroom is strong enough.
- Sharing raw live text directly: do a quick editorial pass in Note first.
Check in this order:
- OS capture permissions
- correct target app and audio route
- model/device fit
- GPU/runtime stability when Whisper is involved
- background noise or competing audio sources
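The check order above can be sketched as a first-failure lookup, so the earliest problem gets fixed before later ones are investigated. This is only an illustration: the check names mirror the list, while `CHECK_ORDER`, `first_failing_check`, and the idea of recording results as booleans are assumptions.

```python
from typing import Mapping, Optional

# Same order as the troubleshooting list above.
CHECK_ORDER = [
    "OS capture permissions",
    "target app and audio route",
    "model/device fit",
    "GPU/runtime stability (Whisper)",
    "background noise or competing audio",
]


def first_failing_check(results: Mapping[str, bool]) -> Optional[str]:
    """Return the earliest check that failed, or None if all passed.

    Assumption: a check with no recorded result counts as not yet passed.
    """
    for name in CHECK_ORDER:
        if not results.get(name, False):
            return name
    return None


observed = {
    "OS capture permissions": True,
    "target app and audio route": False,
    "model/device fit": True,
}
print(first_failing_check(observed))
```

Checking in order matters because a later symptom, such as unstable text, is often caused by an earlier issue like a wrong audio route.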
Read Next
- Microphone-first live capture: Realtime Microphone Transcription
- Record-first session capture: Recording
- Model route guidance: Model Usage Recommendations
- Tuning after the baseline works: Advanced Parameter Transcription