SnailText

How it works

It's three steps. That's it.

A hotkey, a microphone, a text field. Everything else is hidden until you actually need it.

The flow

Press, speak, paste.

  1. Press the hotkey

    Default: ⌥+Space on Mac, Ctrl+Space on Windows. Customisable in Settings. The pill at the bottom of your screen wakes up in a neutral colour, then turns red as recording starts. Your cursor stays exactly where you were typing.

  2. Speak naturally

    Talk like you are explaining something to a colleague. Don't dictate punctuation: SnailText works out commas, periods, and question marks from your speech. Long silences get trimmed automatically (that's our VAD doing its job). Restarts and "ums" you can fix in the same edit pass you'd do after typing.

  3. Release, paste

    Hit the hotkey again. The pill turns purple while transcribing, typically under a second on a GPU and 1–3 seconds on a modern CPU. Then the text lands. Wherever your cursor was, however many words you said.
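If it helps to picture the flow as code, here is a minimal sketch of the pill as a small state machine. The state and event names (`ARMED`, `mic_ready`, `text_ready`) are made up for illustration and are not SnailText's internals; only the colours and their order come from the steps above.

```python
from enum import Enum


class Pill(Enum):
    IDLE = "idle"
    ARMED = "neutral"        # hotkey pressed, recording about to start
    RECORDING = "red"
    TRANSCRIBING = "purple"


# Allowed transitions, keyed by (current state, event).
# Event names are hypothetical labels for this sketch.
TRANSITIONS = {
    (Pill.IDLE, "hotkey"): Pill.ARMED,
    (Pill.ARMED, "mic_ready"): Pill.RECORDING,
    (Pill.RECORDING, "hotkey"): Pill.TRANSCRIBING,
    (Pill.TRANSCRIBING, "text_ready"): Pill.IDLE,
}


def step(state: Pill, event: str) -> Pill:
    """Return the next pill state; events that do not apply leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events: list[str]) -> list[Pill]:
    """Replay events starting from IDLE and return every state visited."""
    state = Pill.IDLE
    visited = [state]
    for event in events:
        state = step(state, event)
        visited.append(state)
    return visited
```

Replaying `["hotkey", "mic_ready", "hotkey", "text_ready"]` walks through neutral, red, purple, and back to idle, which is the whole press, speak, paste loop.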

Behind the scenes

How offline dictation actually works.

The technical bits, in plain language. If you skim past this — that's the point. Nothing here changes how you use the app.

01 — Engine

whisper.cpp running on your machine

We use whisper.cpp, the open-source C++ port of OpenAI's Whisper. It runs entirely on your CPU or GPU — no internet connection required after the model is downloaded. Audio gets processed in RAM and discarded the moment we have your text. Nothing on disk, nothing on a server.

whisper.cpp on GitHub →

02 — Silence

Silero VAD trims the silence

Voice Activity Detection (VAD) decides what is speech and what is not before Whisper sees it. Without it, Whisper hallucinates words from background noise — the classic "thank you for watching" appearing out of nowhere. Silero VAD runs in milliseconds and cuts the silent gaps before they reach the model.
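To make the idea concrete, here is a deliberately simple stand-in. Silero VAD is a neural model; the sketch below is not that. It uses a plain energy threshold, and the frame size and threshold are arbitrary values chosen for the example. It only shows the shape of the step: split audio into frames, keep the ones that look like speech, drop the rest before the recogniser sees them.

```python
import math


def frame_rms(frame: list[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def trim_silence(
    samples: list[float],
    frame_size: int = 4,       # toy value; real frames span tens of milliseconds
    threshold: float = 0.05,   # assumed energy cut-off for "this is speech"
) -> list[float]:
    """Keep only the frames whose RMS energy reaches the threshold."""
    kept: list[float] = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if frame_rms(frame) >= threshold:
            kept.extend(frame)
    return kept
```

A real detector is far better at telling quiet speech from loud background noise, which is exactly why a trained model is used instead of a threshold like this one.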

03 — Hardware

GPU when you have it, CPU when you don't

On Windows we use Vulkan; on Mac we use Metal. Both are auto-detected and work with whatever GPU you have — NVIDIA, AMD, Intel Arc, Apple Silicon. No drivers to install. If your machine doesn't have a usable GPU, we fall back to CPU. The compact models run fine on a regular CPU at 1–3 seconds for short phrases.
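The fallback rule above fits in a few lines. This is a sketch of the decision only, with a hypothetical function name and string labels; how a GPU is actually probed is not shown, and treating unlisted platforms as CPU-only is an assumption.

```python
def pick_backend(os_name: str, has_usable_gpu: bool) -> str:
    """Choose a compute backend: GPU when there is one, CPU otherwise."""
    if not has_usable_gpu:
        return "cpu"
    if os_name == "windows":
        return "vulkan"
    if os_name == "mac":
        return "metal"
    # Assumption: platforms this page does not mention fall back to CPU.
    return "cpu"
```

For example, `pick_backend("mac", True)` gives `"metal"`, while `pick_backend("windows", False)` gives `"cpu"`.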

Want to go deeper? Read "Offline dictation — voice typing without the cloud" for the architectural argument, GDPR and HIPAA implications, and how to verify any dictation app is actually offline. For Mac specifics, see "Dictation for Mac".


The second model

Then a language model cleans it up.

Speech-to-text gives you a raw transcript. A second model, running locally too, edits it into something you can send. That second step is what makes this AI dictation. It is a Pro feature, in beta, and you can turn it off.

01 — Cleanup

Filler out, punctuation in

A compact Gemma model reads the transcript and tidies it: "um" and "you know" dropped, punctuation and capitalization repaired, grammar slips fixed, brand names cased right (github becomes GitHub). It runs on your hardware, so the transcript is never uploaded the way it is in cloud AI dictation tools.

Gemma on ai.google.dev →

02 — Profiles

Five topic profiles, your choice of style

Pick a profile that matches what you dictate: General, Development & IT, Writing, Business, or Academic. The Development profile restores code identifiers in your convention (snake_case, camelCase, kebab-case, PascalCase), so "recording completed" becomes recording_completed. Writing preserves your voice and skips identifier rewriting entirely.
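Here is what the four conventions look like once a phrase has been picked out as an identifier. In the app, deciding which words are an identifier is the language model's job; this sketch covers only the final formatting step, and the function name is made up for the example.

```python
def to_identifier(phrase: str, convention: str) -> str:
    """Format a spoken phrase as an identifier in the given convention."""
    words = phrase.lower().split()
    if not words:
        return ""
    if convention == "snake_case":
        return "_".join(words)
    if convention == "kebab-case":
        return "-".join(words)
    if convention == "PascalCase":
        return "".join(w.capitalize() for w in words)
    if convention == "camelCase":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    raise ValueError(f"unknown convention: {convention}")
```

With "recording completed" as input, the four conventions give `recording_completed`, `recordingCompleted`, `recording-completed`, and `RecordingCompleted`.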

03 — Control

Conservative by default, off when you want

The cleanup is tuned to preserve your meaning, not rewrite it, and it leaves text alone when it is already clean. It can also shift tone or translate. Want the exact words instead? Turn the step off and get plain verbatim speech-to-text. AI dictation is a mode you switch on, not a filter you are stuck with.

What is AI dictation? →

Modes

One hotkey, multiple personalities.

A mode is a saved combination: which model, which language, which dictionary, which snippets. Switch between them right from the pill.

Code

Locked to English. Dictionary loaded with kubectl, gRPC, async/await, your team's API names. Far fewer misrecognitions on the technical jargon you use daily.

Long-form drafting

Bigger model for higher accuracy. Dictionary loaded with names and terms from your project. Snippets ready for headings, callouts, and recurring phrases.

Slack quick

Compact model for instant turnaround. Snippets for your standup template, your meeting-decline template, your /sig signature.

Modes themselves (model + language) work on every tier. Dictionary and snippet auto-replace are Pro features.
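One way to picture a mode is as a small record with four fields. The sketch below is illustrative only: the field names, the model labels, and the `effective_mode` helper are assumptions, not SnailText's actual settings format. It shows the tier rule from the line above, where model and language always apply and dictionary and snippets apply only on Pro.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mode:
    name: str
    model: str       # hypothetical label such as "compact" or "large"
    language: str    # e.g. "en"
    dictionary: dict[str, str] = field(default_factory=dict)  # Pro only
    snippets: dict[str, str] = field(default_factory=dict)    # Pro only


def effective_mode(mode: Mode, is_pro: bool) -> Mode:
    """Return the mode as it applies on the given tier.

    Model and language work everywhere; dictionary and snippet
    auto-replace are dropped when the user is not on Pro.
    """
    if is_pro:
        return mode
    return Mode(name=mode.name, model=mode.model, language=mode.language)
```

A "Code" mode might be `Mode("Code", model="compact", language="en", dictionary={"cube control": "kubectl"})`. On a non-Pro tier, `effective_mode` keeps the model and language and returns empty dictionary and snippet tables.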

Dictionary & snippets (Pro)

Replace and expand on the fly.

Auto-replace runs during transcription on the Pro tier.

Dictionary

Word-level corrections

Tell SnailText that 'see plus plus' should always become C++. Or that 'k eight s' should become k8s, not 'kates'. Custom mappings for technical jargon, product names, or coworker names that Whisper keeps mishearing. Word-boundary aware, case-preserving. No regex needed.

say "see plus plus"
C++
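You never write a regex yourself, but a sketch of what a word-boundary-aware replacement could look like may help. This is one plausible approach, not the app's actual code. Two assumptions: the spoken phrase is matched regardless of how the transcript capitalised it, and the replacement is inserted exactly as you typed it in the dictionary.

```python
import re


def apply_dictionary(text: str, mappings: dict[str, str]) -> str:
    """Replace whole-phrase matches, ignoring case, longest phrases first."""
    for spoken in sorted(mappings, key=len, reverse=True):
        # Lookarounds rather than \b: the phrase must not be glued to other
        # word characters, but may sit next to punctuation.
        pattern = r"(?<!\w)" + re.escape(spoken) + r"(?!\w)"
        replacement = mappings[spoken]
        # A function replacement inserts the text literally, so characters
        # like "+" or "\" in the dictionary entry need no escaping.
        text = re.sub(pattern, lambda _m, r=replacement: r, text, flags=re.IGNORECASE)
    return text
```

With the mapping `{"see plus plus": "C++"}`, the sentence "I write See Plus Plus daily" becomes "I write C++ daily", while "oversee plus plus" is left alone because the match would start in the middle of a word.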

Snippets

Voice-triggered templates

Voice triggers that expand into longer text. Say 'slash sig' and your full email signature lands. Say 'slash standup' and your morning standup template lands. Trigger words don't appear in the final output.

say "slash sig"
Best,
Pavel

Software Architect

Errors

It will misrecognise something. You're in control.

Whisper is good. It's not perfect. Long technical phrases, rare names, unusual jargon — it can stumble. Three things help.

01

Custom dictionary (Pro)

Add your problem words once; they stop being mistakes. Stack-specific vocabulary, colleagues' names, project codenames: each goes in once and behaves from then on. Auto-replace runs on the Pro tier.

02

Bigger Pro models

The accuracy curve is real — the advanced local models catch what the compact ones miss, especially in non-English languages and on long technical phrases.

03

Manual edit after paste

The text lands in your normal editor — your cursor, your keyboard, your usual edit shortcuts. Fix anything you don't like the same way you'd fix any typo.

With AI cleanup turned off, there's no auto-edit between you and the text. What you said is what gets pasted.

That's it

That's the whole product.

About thirty seconds to install. A couple more minutes to get used to the hotkey. Then a hotkey for the rest of your life.

On the fence? See pricing →