How to Use Voice to Text on Mac Across Every App

If you've searched for how to use voice to text on Mac, you've probably landed on Apple's support page and gotten the basics: press the Microphone key, speak, done. Fine for a quick sentence in Mail. Not enough if you want dictation wired into your actual workflow — Notion, Slack, VS Code, legal documents, whatever you spend real time in.

This guide goes further. We'll cover Apple's built-in dictation, its real limitations, and then show you how to build a cross-app voice workflow using per-app modes, keyboard shortcuts, custom vocabulary, and on-device processing that never sends your audio anywhere.

TL;DR

  • Apple's built-in Dictation (this guide covers macOS 13 and later) routes audio to Apple's servers on Intel Macs; on Apple Silicon, dictation can be processed on-device.
  • For cross-app workflows, per-app transcription modes and live dictation that types directly into any focused app give you far more control.
  • On-device tools like VoicePrivate process everything locally, with no cloud uploads, no account required, and no internet needed after the initial model download.

Prerequisites: A Mac running macOS 13 or later. A microphone (built-in or external). If you want fully offline, on-device dictation, download VoicePrivate before you start; it requires one model download on first run, then works entirely offline.

Step 1: Turn On Voice Dictation on Your Mac

How do I turn on voice to text on a Mac?

Go to System Settings > Keyboard > Dictation and toggle Dictation to On. Once enabled, macOS will ask whether to use the microphone key or a custom shortcut to activate it.

Here's what you're choosing between at this point:

On Apple Silicon Macs (M1 and later), macOS can handle a good chunk of dictation locally. On Intel Macs, standard dictation still phones home. If that distinction matters for your work — and it should — know this before you dictate your first sensitive sentence.

Watch out: macOS Dictation is off by default. If you press the Microphone key and nothing happens, you haven't enabled it yet in System Settings. Don't spend ten minutes troubleshooting the wrong thing.

Step 2: Find the Dictation Key and Set Your Shortcut

Where is the Dictation key on a Mac keyboard?

On most modern Macs, the Dictation key is F5 (or the key with a microphone icon in the function row). On older Macs or external keyboards without that key, macOS defaults to pressing Fn (Function) twice in quick succession.

You can change this. In System Settings > Keyboard > Dictation, use the "Shortcut" dropdown to choose your trigger: the Microphone key, double-pressing a modifier key such as Control or Fn, or a custom key combination.

How do I activate voice typing?

Once Dictation is on, click into any text field, then trigger your shortcut. A microphone indicator appears near your cursor. Speak naturally. Pause briefly at the end of sentences. Click outside the indicator or press Escape to stop.

For punctuation, say the word: "comma", "period", "question mark", "new line", "new paragraph". For formatting, say "all caps" before a word. For emoji on macOS Ventura and later, say "emoji" followed by the name — "emoji thumbs up", for example.
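Under the hood, spoken commands like these get mapped to literal characters before the text is inserted. Here's a toy Python sketch of that mapping pass. This is not Apple's implementation, just an illustration of the idea:

```python
# Toy sketch: map spoken punctuation commands to characters.
# Illustrative only -- not how macOS Dictation is actually built.

COMMANDS = {
    "comma": ",",
    "period": ".",
    "question mark": "?",
    "new line": "\n",
    "new paragraph": "\n\n",
}

def apply_commands(tokens):
    """Replace recognized spoken commands with their characters."""
    pieces, i = [], 0
    while i < len(tokens):
        two = " ".join(tokens[i:i + 2])   # try two-word commands first
        if two in COMMANDS:
            pieces.append(COMMANDS[two]); i += 2
        elif tokens[i] in COMMANDS:
            pieces.append(COMMANDS[tokens[i]]); i += 1
        else:
            pieces.append(tokens[i]); i += 1
    # Join words with spaces; attach punctuation and breaks directly.
    text = ""
    for p in pieces:
        if p in {",", ".", "?"} or p.startswith("\n"):
            text += p
        elif text and not text.endswith("\n"):
            text += " " + p
        else:
            text += p
    return text

print(apply_commands("hello comma world period".split()))  # hello, world.
```

Two-word commands have to be checked before single words, which is why "question mark" does not get transcribed as the literal words.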

Pro tip: Set a dedicated keyboard shortcut that doesn't conflict with your most-used apps. Double-pressing Fn works fine in most apps but breaks in some terminal emulators. If you live in iTerm2 or VS Code, set a custom shortcut like Control+Option+D instead.

Step 3: Understand the Cross-App Limitation — and How to Work Around It

Here's the thing: Apple's dictation is a system-level feature that activates wherever your cursor is. That sounds universal. In practice, it works inconsistently across apps.

Some apps intercept keyboard input in ways that interfere with dictation injection. Electron apps — Slack, VS Code, Notion desktop — and certain web-based interfaces have varying levels of reliability. The dictation overlay appears, you speak, and then the text lands in the wrong place, gets duplicated, or doesn't appear at all.

This is the gap most guides don't address.

The more reliable approach is a tool that runs as a system-level overlay, monitors which app is in focus, and injects text at the cursor position using the macOS Accessibility API. VoicePrivate's live dictation mode does exactly this — it types directly into whatever Mac app is active, treating the target app as a passive text receiver rather than relying on that app's own dictation support.

This matters most in Electron apps (Slack, VS Code, Notion desktop), browser-based editors, and any app that handles keyboard input in nonstandard ways.

Pro tip: Test dictation in your actual target apps before committing to a workflow. Spend two minutes dictating a paragraph in each app you use daily. If text drops, duplicates, or the cursor jumps, that app needs a different injection method.

Step 4: Set Up Per-App Transcription Modes

How to do voice transcription on a Mac?

For one-off transcription — converting an existing audio or video file to text — the fastest path is dragging the file into a dedicated transcription app. VoicePrivate supports drag-and-drop file transcription: drop an audio or video file onto the app, and it processes everything locally using its on-device speech recognition engine.

Live transcription while you work is a different workflow:

  1. Open VoicePrivate and configure a transcription mode for each app you use regularly.
  2. Per-app modes let you set vocabulary profile, preferred language, and whether to auto-punctuate.
  3. Switch between apps normally. The active mode follows focus.

You're not constantly reconfiguring. Set it once per app, and the right mode loads automatically when that app comes into focus. In practice, this is what makes the per-app approach worth it.
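The focus-follows-mode behavior is conceptually simple. Here's a Python sketch of the dispatch logic; the bundle IDs are the apps' real identifiers as best I know them, but the mode fields are illustrative and not VoicePrivate's actual configuration format:

```python
# Sketch of per-app transcription-mode dispatch: the frontmost app's
# bundle ID selects a profile, with a fallback default.
# Mode fields are hypothetical, not VoicePrivate's real schema.

MODES = {
    "com.tinyspeck.slackmacgap": {"vocab": "casual",  "auto_punctuate": True},
    "com.microsoft.VSCode":      {"vocab": "code",    "auto_punctuate": False},
    "notion.id":                 {"vocab": "general", "auto_punctuate": True},
}
DEFAULT = {"vocab": "general", "auto_punctuate": True}

def mode_for(bundle_id):
    """Return the transcription profile for the focused app."""
    return MODES.get(bundle_id, DEFAULT)

print(mode_for("com.microsoft.VSCode")["vocab"])  # code
print(mode_for("com.unknown.app")["vocab"])       # general
```

The important design point is the fallback: an app you never configured still gets sensible behavior instead of no dictation at all.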

Note: VoicePrivate requires macOS 13 or later. It's optimized for Apple Silicon but fully supports Intel Macs. After the first-run model download, no internet connection is required — ever.

Step 5: Add Custom Vocabulary for Your Domain

Apple's built-in dictation handles everyday English reasonably well. It struggles with domain-specific terms: medical abbreviations, legal citations, financial instrument names, proprietary product names, people's names spelled unconventionally.

VoicePrivate addresses this two ways:

Custom vocabulary: Add terms the engine should recognize correctly. If you regularly dictate "amortization schedule" or "anterior cruciate ligament" or a client's unusual company name, add it once. It applies globally.
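One common way tools apply custom vocabulary is a post-recognition correction pass over the transcript. A minimal Python sketch of that idea follows; the terms and the mechanism are illustrative, since VoicePrivate's internals aren't documented here:

```python
import re

# Hypothetical correction list: frequent misrecognitions mapped to the
# correct domain term. Illustrative only.
VOCAB = {
    "anterior crucial ligament": "anterior cruciate ligament",
    "amortisation schedule": "amortization schedule",
}

def correct(transcript):
    """Apply case-insensitive whole-phrase corrections to a transcript."""
    for wrong, right in VOCAB.items():
        transcript = re.sub(re.escape(wrong), right, transcript,
                            flags=re.IGNORECASE)
    return transcript

print(correct("Partial tear of the anterior crucial ligament"))
```

Real engines typically bias recognition toward custom terms during decoding rather than patching afterward, but the effect from the user's side is the same: add the term once, get it spelled right everywhere.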

Specialty editions: VoicePrivate ships in five editions — General, Healthcare, Legal, Finance, and Insurance. Each specialty edition comes pre-loaded with domain-specific vocabulary. If you're in healthcare and dictating clinical notes, the Healthcare edition already knows the terminology you use daily. You can review the Healthcare features for a full breakdown.

Pro tip: Even if you're on the General edition, start building your custom vocabulary list from day one. A few dozen terms dramatically improves accuracy for specialized content. Think: client names, product names, acronyms your industry uses that a general model won't catch.

Step 6: Use AI Command Mode to Transform Dictated Text

Most voice-to-text tools stop at transcription. You get a raw transcript and edit it yourself. VoicePrivate includes an AI command mode that lets you transform text with instructions — all processed locally.

Practical examples: tighten a rambling dictated paragraph, convert spoken notes into a bullet list, or clean up grammar and punctuation before pasting the result into an email.

This isn't sending your text to an external API. It runs on-device. No cloud, no account, no data leaving your machine.
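To make the shape of the feature concrete, here's a toy Python dispatcher that applies a named transformation to dictated text. The real feature uses an on-device language model; these handlers are crude stand-ins that only show the "instruction in, transformed text out" pattern:

```python
# Toy command dispatcher -- a stand-in for an on-device language model.
# The command names and handlers are hypothetical.

def bulletize(text):
    """Turn sentences into a bullet list."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return "\n".join(f"- {s}" for s in sentences)

def tighten(text):
    """Strip common filler words."""
    fillers = {"basically", "actually", "really"}
    return " ".join(w for w in text.split() if w.lower() not in fillers)

COMMANDS = {"bulletize": bulletize, "tighten": tighten}

def apply_command(command, text):
    return COMMANDS[command](text)

print(apply_command("bulletize", "First point. Second point."))
```

A model-backed version would accept free-form instructions instead of fixed command names, but the local-only data flow is the part that matters.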

For journalists, lawyers, clinicians, or anyone handling sensitive material, this matters more than most guides tell you. You get editing intelligence without the exposure that comes with pasting sensitive content into a browser-based AI tool.

Watch out: AI command mode is a paid feature. The free tier covers basic transcription. If you need AI-assisted text transformation, speaker diarization, longer file support, or expanded export formats, you'll want to look at the paid subscription plans.

Step 7: Export Transcripts in the Right Format for Your Workflow

Matching export format to destination app

Raw transcription is only the start. Where the text goes next determines which export format you need.

VoicePrivate supports five export formats:

  • .txt for plain text that pastes anywhere
  • .md for Markdown with paragraph structure intact (Notion, Obsidian)
  • .json for structured output in developer workflows
  • .srt and .vtt for video captions

If you're transcribing interviews for a content workflow, .md drops cleanly into Notion or Obsidian with paragraph structure intact. If you're captioning video, .srt or .vtt exports plug directly into DaVinci Resolve, Final Cut Pro, or your video hosting platform. Pick the format that removes a step, not adds one.
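Caption formats like .srt are simple enough to generate or sanity-check by hand. Here's a Python sketch that turns timed transcript segments into SRT blocks, useful for verifying that an export matches what your editor expects:

```python
def srt_timestamp(seconds):
    """Format seconds as the SRT HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def to_srt(segments):
    """segments: list of (start_sec, end_sec, text) tuples."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"

print(to_srt([(0.0, 2.5, "Hello there."), (2.5, 5.0, "Welcome back.")]))
```

Note the SRT quirk that trips people up: the timestamp uses a comma before the milliseconds, while .vtt uses a period.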

Note: Expanded export formats (including .json, .srt, and .vtt) are available on paid plans. The free tier supports basic text export.

Step 8: Understand the Privacy Architecture Before You Dictate Sensitive Content

Where does your audio actually go?

This question almost never gets answered directly in voice-to-text guides. Here's the answer for each option:

Apple built-in dictation (standard mode): Audio is sent to Apple's servers for processing. Apple's privacy policy governs retention. The audio does leave your device.

Apple dictation on Apple Silicon (Enhanced mode): Processed on-device for shorter dictation sessions. Doesn't upload audio. The better option if you're on a newer Mac and privacy matters.

Cloud tools (Otter.ai, Google Docs voice typing, etc.): Audio and transcripts are stored on third-party servers. Convenient — right up until it isn't. Once it's uploaded, you're subject to that service's data practices, breach risk, and terms of service changes.

VoicePrivate: 100% on-device processing. Zero cloud uploads. No account required. No telemetry. Your audio and transcripts never leave your machine. That's the full architecture — no asterisks.

If you handle sensitive client information, confidential business discussions, or personal health information, the on-device architecture is worth evaluating carefully against your own requirements. We describe the technical setup; you draw your own conclusions about what that means for your situation. For a detailed breakdown of the privacy architecture in a clinical context, see VoicePrivate Healthcare Privacy.

Pro tip: Even if privacy isn't your primary concern, on-device processing means dictation works on a plane, in a hotel with unreliable Wi-Fi, or anywhere else without a stable internet connection. After the one-time model download, VoicePrivate requires zero internet.

Step 9: Troubleshoot Common Dictation Accuracy Problems

Accuracy varies by use case. That's the honest answer. Here's how to push it higher when you hit problems:

Problem: Misheard common words. Check microphone input level and distance first; then add recurring corrections to your custom vocabulary list.

Problem: Punctuation not appearing. Speak punctuation explicitly ("comma", "period"), or turn on auto-punctuation in your per-app mode if your tool supports it.

Problem: Language detection failures. Set a fixed preferred language per app instead of relying on automatic detection.

Problem: Text appearing in the wrong place. The focused app is likely intercepting keyboard input (common in Electron apps). Click directly into the target text field before dictating, or use a tool that injects text through the Accessibility API.

Watch out: Background noise is the single biggest accuracy killer. A mechanical keyboard, an open window with traffic, or a running fan all degrade recognition. If accuracy suddenly drops, check your acoustic environment before adjusting any settings.
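If you want an objective read on your acoustic environment, compute the RMS level of a few seconds of recorded "silence" from your mic. Capture itself is platform-specific and omitted here; this Python sketch works on any list of 16-bit samples, and the quiet/noisy threshold mentioned in the comment is a rule of thumb, not a vendor spec:

```python
import math

def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of 16-bit audio samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / full_scale)

# Rough rule of thumb (assumption): a quiet room should idle well
# below -50 dBFS; a fan or traffic noise sits much higher.
quiet = [10, -12, 8, -9]
noisy = [8000, -9000, 7500, -8200]
print(round(rms_dbfs(quiet)), round(rms_dbfs(noisy)))  # -70 -12
```

If your silence reads hot, fix the room or the mic placement before touching any dictation settings.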

What Two Types of Dictation Does macOS Offer?

A question that comes up often. macOS offers two dictation modes:

  1. Standard Dictation — sends audio to Apple for processing, requires internet, works on all Macs.
  2. Enhanced Dictation / on-device Dictation — processes locally on Apple Silicon Macs, no internet required, handles continuous dictation without a time limit.

Bottom line: on-device dictation on M-series Macs responds faster and keeps audio local. Standard dictation depends on your connection speed and Apple's servers.

For power users who need cross-app reliability, longer sessions, domain vocabulary, or guaranteed offline capability, a dedicated on-device tool fills gaps that either built-in mode leaves open. For more on how these options compare, see Voice to Text for Mac: Speed, Accuracy, and Privacy for Power Users.


Key Takeaways

  • Turn on macOS Dictation in System Settings > Keyboard > Dictation. The Microphone key is F5 on most modern Mac keyboards, or set a custom shortcut to avoid conflicts with your apps.
  • Apple's built-in dictation works for simple use cases but behaves inconsistently in Electron and web-based apps. A system-level tool that injects text via the Accessibility API is more reliable across all apps.
  • Per-app transcription modes, custom vocabulary, and specialty editions (Healthcare, Legal, Finance, Insurance) let you match the transcription configuration to the actual work you're doing.
  • VoicePrivate processes everything on-device with no cloud uploads, no account required, and no internet needed after the initial setup — a meaningful difference from cloud-based alternatives like Otter.ai or Google Docs voice typing.
  • Export formats include .txt, .md, .json, .srt, and .vtt, covering content workflows, developer use cases, and video captioning. Expanded formats are available on paid plans.