Capture a selected app's audio in real time, then choose Whisper or realtime models based on latency, hardware, and review goals.

Realtime app transcription, App audio capture, Whisper

Realtime App Transcription

[Screenshot: app audio capture]

What This Page Solves

Realtime App Transcription is for app audio you want to read while it is happening, such as:

meetings in Zoom, Teams, or Google Meet
webinars and online classes
app-based demos or replay review sessions

This is a live workflow. It is meant to reduce delay between hearing and reading, then hand the result off to Note for cleanup.

When To Use It

Prefer this workflow when

the source is app audio rather than a local file
you need live text during the session
you still want the result to flow into notes afterward

Do not start here when

you already have the source file: use File Transcription
the main source is your microphone: use Realtime Microphone Transcription
you need a record-first archive: use Recording

Recommended Workflow

Choose the target app before the session starts.
Run a short smoke test for permissions, routing, and text refresh.
Pick Whisper or a realtime model based on latency and hardware.
Keep the setup stable during the session instead of switching devices or models midstream.
After the session, continue in Note for names, action items, and structure cleanup.

Key Decisions

1. Realtime model or Whisper

Situation	First choice	Why
no strong GPU, steady low-latency capture matters most	Realtime model	lighter runtime pressure and better realtime defaults
strong GPU and higher-quality live text target	Whisper	can still deliver a good real-time factor (RTF) when hardware is strong enough
long sessions on mixed hardware	Realtime model	more forgiving for stability-first live capture
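
The table above can be read as a small decision rule. The sketch below is only an illustration: the Session fields and the first_choice function are hypothetical names, not part of the app, and the rule that a long session on mixed hardware outranks a strong GPU is an assumption about how overlapping rows should be resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical description of a planned capture session."""
    strong_gpu: bool       # GPU with comfortable headroom for Whisper
    long_session: bool     # capture expected to run for a long time
    mixed_hardware: bool   # the same setup must work on varied machines


def first_choice(session: Session) -> str:
    """Return the table's first choice for a session.

    Assumption: the stability-first row (long session on mixed hardware)
    takes precedence when it overlaps with the strong-GPU row.
    """
    if session.long_session and session.mixed_hardware:
        return "Realtime model"
    if session.strong_gpu:
        return "Whisper"
    return "Realtime model"


laptop_call = Session(strong_gpu=False, long_session=False, mixed_hardware=False)
workstation = Session(strong_gpu=True, long_session=False, mixed_hardware=False)
all_day_event = Session(strong_gpu=True, long_session=True, mixed_hardware=True)
```

Treat the result as a starting point for the smoke test rather than a final answer: measured latency on your own hardware should override it.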

2. What to validate before a real meeting

Validate these first:

app capture permission
correct target app selection
text latency
segmentation stability
whether the result lands correctly in Note afterward
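
Text latency is easier to judge with a number than by feel. The sketch below assumes you note, during a short smoke test, when a phrase ended in the app audio and when its text appeared on screen (by stopwatch or screen recording). The function names and the timestamp pairs are hypothetical; the app is not assumed to export these values.

```python
from statistics import median


def text_latencies(events):
    """Return per-phrase latency in seconds.

    events: (heard_seconds, shown_seconds) pairs, where heard_seconds is when
    the phrase ended in the audio and shown_seconds is when its text appeared.
    """
    return [shown - heard for heard, shown in events]


def summarize(events):
    """Summarize a smoke test as median and worst-case latency."""
    latencies = text_latencies(events)
    return {"median": median(latencies), "worst": max(latencies)}


# Hand-recorded sample from a hypothetical three-phrase smoke test.
sample = [(1.0, 1.5), (2.0, 3.0), (3.0, 3.5)]
report = summarize(sample)
```

A worst-case value much larger than the median is a hint to look at segmentation stability before the real meeting.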

3. Why this is different from file review

In file workflows you can optimize for accuracy first. In realtime app workflows, latency and stability take priority from the start. That is why the better model is not always the largest one.
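
One way to make this concrete is the real-time factor (RTF): processing time divided by the duration of the audio processed. An RTF below 1 means transcription keeps up with live audio, and an RTF above 1 means the text falls further behind the longer the session runs. The sketch below uses hypothetical function names and made-up timings purely to show the arithmetic.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio transcribed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def keeps_up(rtf: float, budget: float = 1.0) -> bool:
    """True when the model transcribes faster than audio arrives.

    budget is a hypothetical knob: set it below 1.0 to demand headroom.
    """
    return rtf < budget


# Illustrative timings for a 6-second chunk of audio (not measurements).
smaller_model = real_time_factor(1.5, 6.0)   # 0.25
larger_model = real_time_factor(7.5, 6.0)    # 1.25
```

With these illustrative numbers the larger model would fall behind live audio even if its text were more accurate, which is the trade-off described above.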

Common Mistakes And Troubleshooting

Granting permissions after the meeting already started: Finish permissions and routing checks first.
Switching audio devices during capture: Keep routing stable during the session whenever possible.
Assuming realtime means realtime models only: Whisper is still valid when GPU headroom is strong enough.
Sharing raw live text directly: Do a quick editorial pass in Note first.

Check in this order:

OS capture permissions
correct target app and audio route
model/device fit
GPU/runtime stability when Whisper is involved
background noise or competing audio sources
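
The order matters because an earlier failure can mask or mimic a later one. The sketch below treats the list as an ordered checklist and reports the first item that has not passed. The check names mirror the list above; the function, the results dictionary, and the choice to treat an unrecorded check as failing are assumptions made for illustration.

```python
from typing import Optional

# Same order as the troubleshooting list above.
CHECK_ORDER = [
    "OS capture permissions",
    "target app and audio route",
    "model/device fit",
    "GPU/runtime stability (Whisper)",
    "background noise or competing audio",
]


def first_failure(results: dict[str, bool]) -> Optional[str]:
    """Return the first check that failed or was never recorded.

    results maps a check name to True (passed) or False (failed).
    Assumption: a check missing from results counts as not yet passed.
    """
    for name in CHECK_ORDER:
        if not results.get(name, False):
            return name
    return None


# Hypothetical notes from a troubleshooting pass.
notes = {
    "OS capture permissions": True,
    "target app and audio route": False,
    "model/device fit": True,
}
next_step = first_failure(notes)
```

Here the audio route is the item to fix first, even though later checks have not been recorded yet.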