How it works

It's three steps. That's it.

A hotkey, a microphone, a text field. Everything else is hidden until you actually need it.

The flow

Press, speak, paste.

01

Press the hotkey

⌘ Shift Space

Default: ⌘ Shift Space on Mac, Ctrl Shift Space on Windows. Customisable in Settings. The pill at the bottom of your screen wakes up: neutral colour first, then turning red as recording starts. Your cursor stays exactly where you were typing.

02

Speak naturally

Talk like you're explaining something to a colleague. Don't dictate punctuation: it figures out commas, periods, and question marks from your speech. Long silences get trimmed automatically (that's our VAD doing its job). Restarts and "ums" you'll fix in the same edit pass you'd do after typing.

03

Release, paste

⌘ Shift Space

Hit the hotkey again. The pill turns purple while transcribing — typically under a second on a GPU, 1–3 seconds on a modern CPU. Then the text lands. Wherever your cursor was, however many words you said.
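
If you like to see things as code, the pill's colour changes can be pictured as a tiny state machine. This is only a sketch: the state names, event names, and transition table are made up for illustration and are not taken from the app itself.

```python
from enum import Enum


class PillState(Enum):
    """Hypothetical pill states, named after the colours described above."""
    IDLE = "idle"
    ARMED = "neutral"        # hotkey pressed, microphone warming up
    RECORDING = "red"
    TRANSCRIBING = "purple"


# Assumed events: "hotkey", "mic_ready", "text_pasted".
TRANSITIONS = {
    (PillState.IDLE, "hotkey"): PillState.ARMED,
    (PillState.ARMED, "mic_ready"): PillState.RECORDING,
    (PillState.RECORDING, "hotkey"): PillState.TRANSCRIBING,
    (PillState.TRANSCRIBING, "text_pasted"): PillState.IDLE,
}


def next_state(state: PillState, event: str) -> PillState:
    """Return the next state, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

Unknown events leave the state unchanged, so a stray key press while transcribing would simply be ignored in this model.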

Behind the scenes

How offline dictation actually works.

The technical bits, in plain language. If you skim past this — that's the point. Nothing here changes how you use the app.

01 — Engine

whisper.cpp running on your machine

We use whisper.cpp, the open-source C++ port of OpenAI's Whisper. It runs entirely on your CPU or GPU — no internet connection required after the model is downloaded. Audio gets processed in RAM and discarded the moment we have your text. Nothing on disk, nothing on a server.

whisper.cpp on GitHub →

02 — Silence

Silero VAD trims the silence

Voice Activity Detection (VAD) decides what is speech and what is not before Whisper sees it. Without it, Whisper hallucinates words from background noise — the classic "thank you for watching" appearing out of nowhere. Silero VAD runs in milliseconds and cuts the silent gaps before they reach the model.
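
To make the trimming idea concrete, here is a minimal sketch of how per-frame speech scores could be turned into "keep these parts" ranges. It does not call Silero VAD. It assumes some VAD has already produced one probability per audio frame, and the function name, threshold, and padding values are illustrative guesses.

```python
from typing import List, Tuple


def speech_spans(probs: List[float], threshold: float = 0.5,
                 pad: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges judged to be speech; end is exclusive.

    probs     : assumed per-frame speech probabilities from any VAD
    threshold : frames at or above this score count as speech
    pad       : extra frames kept on each side so words are not clipped
    """
    spans = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # speech begins
        elif p < threshold and start is not None:
            spans.append((start, i))       # speech ends
            start = None
    if start is not None:
        spans.append((start, len(probs)))

    # Pad each span, then merge spans that now touch or overlap.
    merged = []
    for s, e in spans:
        s = max(0, s - pad)
        e = min(len(probs), e + pad)
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged
```

Only the frames inside the returned ranges would be handed to the speech model; everything else is the silence that gets dropped.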

03 — Hardware

GPU when you have it, CPU when you don't

On Windows we use Vulkan; on Mac we use Metal. Both are auto-detected and work with whatever GPU you have — NVIDIA, AMD, Intel Arc, Apple Silicon. No drivers to install. If your machine doesn't have a usable GPU, we fall back to CPU. The compact models run fine on a regular CPU at 1–3 seconds for short phrases.
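
The fallback rule above is simple enough to write down in a few lines. This is a sketch under assumptions: the function name and the string labels are hypothetical, and real GPU detection is reduced to a boolean passed in by the caller.

```python
def pick_backend(os_name: str, gpu_usable: bool) -> str:
    """Choose a compute backend following the rule described above.

    os_name    : "windows" or "mac" (labels assumed for this sketch)
    gpu_usable : result of some GPU probe, which is not shown here
    """
    if not gpu_usable:
        return "cpu"          # no usable GPU: fall back to CPU
    if os_name == "windows":
        return "vulkan"
    if os_name == "mac":
        return "metal"
    return "cpu"              # unknown platform: play it safe
```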

Modes

One hotkey, multiple personalities.

A mode is a saved combination: which model, which language, which dictionary, which snippets. Switch between them right from the pill.

Code

Locked to English. Dictionary loaded with kubectl, gRPC, async/await, your team's API names. Far fewer misrecognitions on the technical jargon you use daily.

Russian writing

Locked to RU. Dictionary loaded with character names from your novel. Snippets ready for chapter headings and recurring phrases.

Slack quick

Compact model for instant turnaround. Snippets for your standup template, your meeting-decline template, your /sig signature.
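
One way to picture a mode is as a small record holding the four choices. The sketch below is illustrative only: the field names, the model labels "advanced" and "compact", and the sample entries are assumptions, not the app's actual settings format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Mode:
    """A saved combination of model, language, dictionary, and snippets."""
    name: str
    model: str                                  # hypothetical model label
    language: Optional[str] = None              # None means not locked
    dictionary: Dict[str, str] = field(default_factory=dict)
    snippets: Dict[str, str] = field(default_factory=dict)


# Example modes loosely based on the descriptions above.
MODES = {
    "code": Mode(name="Code", model="advanced", language="en",
                 dictionary={"see plus plus": "C++"}),
    "slack": Mode(name="Slack quick", model="compact",
                  snippets={"slash standup": "Yesterday:\nToday:\nBlockers:"}),
}


def switch_mode(key: str) -> Mode:
    """Look up a mode by key, as the pill's mode switcher might."""
    return MODES[key]
```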

Dictionary & snippets

Replace and expand on the fly.

Dictionary

Word-level corrections

Tell SnailText that 'see plus plus' should always become C++. Or that 'Алёша' — your colleague's name — shouldn't get autocorrected to 'Аляска'. Word-boundary aware, case-preserving. No regex needed.

say "see plus plus"

→ C++
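
For a feel of what "word-boundary aware" means in practice, here is a rough sketch of such a replacement pass. It is not the app's code. The function name is made up, and it assumes "case-preserving" means the replacement is written exactly as stored, whatever casing the spoken phrase was transcribed with.

```python
import re
from typing import Dict


def apply_dictionary(text: str, entries: Dict[str, str]) -> str:
    """Replace each spoken phrase with its written form.

    Matches ignore case but must not sit inside a longer word.
    """
    for spoken, written in entries.items():
        pattern = re.compile(
            r"(?<!\w)" + re.escape(spoken) + r"(?!\w)",
            re.IGNORECASE,
        )
        # A function replacement avoids backslash handling in `written`.
        text = pattern.sub(lambda m, w=written: w, text)
    return text


entries = {"see plus plus": "C++"}
print(apply_dictionary("I write See Plus Plus daily", entries))
```

The lookarounds are what stop a phrase from matching in the middle of another word, so "foresee plus plus" would be left alone.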

Snippets

Voice-triggered templates

Voice triggers that expand into longer text. Say 'slash sig' and your full email signature lands. Say 'slash standup' and your morning standup template lands. Trigger words don't appear in the final output.

say "slash sig"

→ Best,
Pavel
—
Software Architect
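
Snippet expansion can be sketched the same way: find the spoken trigger, swap in the template, and leave no trace of the trigger. The code below is an illustration under assumptions. The function name is hypothetical, the signature is shortened, and swallowing one trailing punctuation mark after the trigger is a guess at sensible behaviour.

```python
import re
from typing import Dict


def expand_snippets(text: str, snippets: Dict[str, str]) -> str:
    """Replace each spoken trigger with its template text."""
    for trigger, expansion in snippets.items():
        pattern = re.compile(
            r"(?<!\w)" + re.escape(trigger) + r"(?!\w)[.,!?]?",
            re.IGNORECASE,
        )
        # The trigger (and one trailing punctuation mark) is removed,
        # so only the expansion shows up in the final text.
        text = pattern.sub(lambda m, e=expansion: e, text)
    return text


snippets = {"slash sig": "Best,\nPavel\nSoftware Architect"}
print(expand_snippets("Thanks for the review. Slash sig.", snippets))
```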

Errors

It will misrecognise something. You're in control.

Whisper is good. It's not perfect. Long technical phrases, rare names, unusual jargon — it can stumble. Three things help.

Custom dictionary

Add your problem words once; they stop being mistakes. Stack-specific vocabulary, colleagues' names, project codenames: all go in once and behave forever.

Bigger Pro models

The accuracy curve is real — the advanced local models catch what the compact ones miss, especially in non-English languages and on long technical phrases.

Manual edit after paste

The text lands in your normal editor — your cursor, your keyboard, your usual edit shortcuts. Fix anything you don't like the same way you'd fix any typo.

There's no AI auto-edit between you and the text. What you said is what gets pasted.

That's it

That's the whole product.

About thirty seconds to install. A couple more minutes to get used to the hotkey. Then a hotkey for the rest of your life.

Download for Mac, or for Windows →

On the fence? See pricing →