Last updated: 2026-02-08

Real-time App Transcription

[Screenshot placeholder: Real-time App Transcription page (App 2.0), showing the app selector, optional mic overlay toggle, live caption area, and runtime state bar. Suggested filename: realtime-app-v2-en.png]

Scope

Real-time App Transcription captures audio from a selected application and transcribes it while the session is running. It covers:

  • Per-app audio capture
  • Optional microphone capture
  • Live transcription feedback
  • Post-session transition to Note workflows

For URL workflows, use Link Transcription instead.

Use Cases

  • Zoom/Teams/Meet sessions
  • Online lectures and webinars
  • Game stream commentary capture

Steps

  1. Open Real-time App Transcription from Home.
  2. Select model/language and grant capture permissions.
  3. Pick the target app from the app list.
  4. Optionally enable microphone capture.
  5. Start the session and monitor the live output.
  6. Stop the session, then continue editing and exporting in Note.
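
The order of the steps above can be read as a small state machine. The sketch below is illustrative only: `SessionState`, `Session`, and their methods are hypothetical names, not part of any AudioNote API, and the rule that the microphone is enabled before the session starts is an assumption taken from the step order.

```python
from enum import Enum, auto


class SessionState(Enum):
    IDLE = auto()          # page opened from Home, nothing configured yet
    CONFIGURED = auto()    # model/language selected, permissions granted
    APP_SELECTED = auto()  # target app picked from the app list
    RUNNING = auto()       # capture and live transcription in progress
    STOPPED = auto()       # session ended, transcript continues in Note


# Allowed transitions, mirroring the order of the steps (hypothetical model).
TRANSITIONS = {
    SessionState.IDLE: {SessionState.CONFIGURED},
    SessionState.CONFIGURED: {SessionState.APP_SELECTED},
    SessionState.APP_SELECTED: {SessionState.RUNNING},
    SessionState.RUNNING: {SessionState.STOPPED},
    SessionState.STOPPED: set(),
}


class Session:
    def __init__(self):
        self.state = SessionState.IDLE
        self.mic_enabled = False

    def advance(self, target):
        """Move to the next state, rejecting out-of-order steps."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.name} to {target.name}"
            )
        self.state = target

    def enable_microphone(self):
        """Optional step 4; assumed here to happen before the session starts."""
        if self.state in (SessionState.RUNNING, SessionState.STOPPED):
            raise ValueError("enable the microphone before starting the session")
        self.mic_enabled = True
```

In this model, trying to start a session before an app is selected raises an error, which matches the advice to finish permissions and app selection before the meeting begins.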

Preflight Checklist

  • Close unnecessary audio sources to reduce mixer interference.
  • Keep the target app in a capturable window layer (especially on macOS).
  • Prefer stable output/input devices and avoid device hot-switching mid-session.
  • Run a short 3–5 minute validation session before long meetings.
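
As a rough illustration, the checklist can be expressed as a function that returns the items still open. Everything here is a sketch under stated assumptions: `PreflightInput` and `preflight_warnings` are hypothetical names, the values are entered by hand rather than detected, and the 3-minute threshold is simply the lower bound of the validation window mentioned above.

```python
from dataclasses import dataclass


@dataclass
class PreflightInput:
    other_audio_sources: int        # audio sources open besides the target app
    target_window_capturable: bool  # target app sits in a capturable window layer
    devices_stable: bool            # no device switch planned during the session
    validation_minutes: float       # length of the trial session already run


def preflight_warnings(p):
    """Return the checklist items that still need attention (hypothetical helper)."""
    warnings = []
    if p.other_audio_sources > 0:
        warnings.append("close unnecessary audio sources")
    if not p.target_window_capturable:
        warnings.append("move the target app to a capturable window layer")
    if not p.devices_stable:
        warnings.append("settle on input/output devices before starting")
    if p.validation_minutes < 3:
        warnings.append("run a 3-5 minute validation session")
    return warnings
```

An empty result means every item on the checklist has been covered; any returned string names the item to fix before a long meeting.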

Term Explanations

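  • Per-app capture: records audio from the selected app only, not every system sound.
  • System mixer: the OS layer that routes and mixes audio; it can merge streams, so audio from other apps may leak into the capture.
  • Spaces (macOS): virtual desktops, including the separate Space a full-screen app occupies; an app on a different Space may not be available for capture.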
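
The difference between per-app capture and a merged mixer stream can be shown with a toy model. This is not how the product or the OS implements capture; `AudioChunk`, `per_app_capture`, and `mixed_capture` are hypothetical names, and short strings stand in for audio data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioChunk:
    app: str   # name of the app that produced the audio
    text: str  # stand-in for the audio content


def per_app_capture(chunks, target_app):
    """Keep only the chunks produced by the selected app."""
    return [c.text for c in chunks if c.app == target_app]


def mixed_capture(chunks):
    """Model a mixer that has merged the streams: app boundaries are lost."""
    return [c.text for c in chunks]


chunks = [
    AudioChunk("Zoom", "welcome everyone"),
    AudioChunk("Music", "background song"),
    AudioChunk("Zoom", "first agenda item"),
]

only_meeting = per_app_capture(chunks, "Zoom")
everything = mixed_capture(chunks)
```

Here `only_meeting` holds the two Zoom chunks, while `everything` also contains the music, which is the kind of unrelated audio discussed in the FAQ.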

Real Scenario: Live Meeting Handoff

A common goal is to produce a shareable meeting transcript within 10 minutes of the call ending.

  1. Run a 2–3 minute pre-meeting smoke test (capture target + live text update).
  2. Keep only essential audio sources active during the session.
  3. After the meeting, fix names/action items in Note before distribution.

FAQ

Q: Can I transcribe multiple apps at once?
A: The typical workflow is one selected app per session.

Q: Why is unrelated audio captured?
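A: On some setups, OS routing and mixer behavior merge streams, so audio from other apps can leak into the capture. Closing unnecessary audio sources before the session reduces this.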

Q: Why does full-screen app capture fail on macOS?
A: macOS security constraints may block capture across different Spaces/layers.

Common Mistakes

  • Mistake: requesting permissions after the meeting starts
    Better: complete permission and device checks before the session.
  • Mistake: hot-switching audio devices mid-session
    Better: keep stable devices during capture; switch only after completion.
  • Mistake: sharing raw live captions directly
    Better: do a quick 3–5 minute editorial pass in Note first.

Limitations

  • Status: Stable (non-Beta). Rollout may be feature-gated, so availability can depend on account entitlements.
  • Requires OS-level capture permissions and stable device routing.
  • Performance and latency vary by model, hardware, and system load.
  • Platform: Windows and macOS are both supported; differences are mainly in routing and window-layer permissions.

Copyright © 2026. Made by AudioNote, All rights reserved.