Yes, you can do voice typing on Windows with zero internet and no API key. In PipeVoice, choosing Local Whisper for transcription (and optionally Ollama for AI cleanup) keeps every part of the pipeline on your own PC: your audio is transcribed locally, your text is polished locally, and nothing is ever sent to a server.
PipeVoice is a free, open-source, push-to-talk dictation tool for Windows 10 and 11. You hold a hotkey (Ctrl+\ by default, or Right Ctrl), speak, release, and it types real keystrokes into whatever app is focused: a terminal, an editor, your browser, or a chat box. This guide covers the fully offline path and how to get good accuracy on normal hardware.
Why people want offline dictation
There are three common reasons to keep dictation local:
- Privacy. If you dictate sensitive material (legal notes, medical text, client data, source code under NDA), you may not want audio leaving your machine at all. The offline path guarantees it does not.
- No subscription. Cloud dictation tools often charge monthly. The local path in PipeVoice costs nothing: no key, no metered usage, no account.
- No internet required. On a plane, a locked-down corporate laptop, or a flaky connection, cloud transcription simply stops working. Once its model has been downloaded, Local Whisper does not depend on a network.
If those reasons matter to you, you are in the right place. If you mainly want raw speed, cloud engines still have an edge, and we cover that trade-off below.
The fully offline stack: Local Whisper plus Ollama
PipeVoice splits dictation into two stages, and both have an offline option:
- Transcription turns your speech into text. The offline choice here is Local Whisper (powered by faster-whisper), which runs entirely on your CPU.
- AI polish (called "Flow mode") is optional. It cleans filler words, fixes punctuation, and corrects casing. The offline choice here is Ollama, a local model runner.
Pick Local Whisper for transcription and Ollama for polish, and you have a pipeline that needs no key and no connection. Worth noting: the polish stage only ever receives text, never your audio. So even if you later switch polish to a cloud provider, your voice recording itself stays with whichever transcription engine you chose.
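The two-stage split can be pictured with a minimal sketch. This is not PipeVoice's actual code: the class, the type aliases, and the two stand-in engines are hypothetical names, used only to show that the audio stops at stage one and the polish stage sees text alone.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type aliases; PipeVoice's real internals are not shown in this guide.
Transcriber = Callable[[bytes], str]   # audio in, raw text out
Polisher = Callable[[str], str]        # text in, cleaned text out


@dataclass
class DictationPipeline:
    transcribe: Transcriber
    polish: Optional[Polisher] = None  # Flow mode is optional

    def run(self, audio: bytes) -> str:
        # Stage 1: only the transcription engine ever sees the audio.
        text = self.transcribe(audio)
        # Stage 2: the polish provider receives text only, never audio.
        if self.polish is not None:
            text = self.polish(text)
        return text


# Stand-in engines so the sketch runs without any model installed.
def fake_local_whisper(audio: bytes) -> str:
    return "um so this is a test"


def fake_ollama_polish(text: str) -> str:
    words = [w for w in text.split() if w not in {"um", "uh"}]
    sentence = " ".join(words)
    return sentence[:1].upper() + sentence[1:] + "."
```

Swapping either stage for a different provider changes only the function passed in, which is why the choice of polish provider never affects where the audio goes.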
How to enable Local Whisper in PipeVoice
Once PipeVoice is installed, switching to the offline engine takes a moment:
- Open PipeVoice settings and set the transcription engine to Local Whisper.
- The first time you use it, PipeVoice downloads a model (around 150MB). This is a one-time download; after that it works offline.
- Hold your dictation hotkey, speak, and release. The text is typed into your focused app as real keystrokes.
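As a rough illustration of the hold, speak, release flow, here is a small sketch of a push-to-talk session. The class and callback names are hypothetical, and real hotkey and microphone handling are left out; it only shows audio being buffered while the key is held and transcribed once on release.

```python
from typing import Callable, List


class PushToTalkSession:
    """Buffers audio while the hotkey is held and transcribes once on release."""

    def __init__(self, transcribe: Callable[[bytes], str], type_text: Callable[[str], None]):
        self._transcribe = transcribe    # hypothetical: audio bytes in, text out
        self._type_text = type_text      # hypothetical: sends text to the focused app
        self._chunks: List[bytes] = []
        self._recording = False

    def on_key_down(self) -> None:
        self._chunks.clear()
        self._recording = True

    def on_audio_chunk(self, chunk: bytes) -> None:
        if self._recording:              # ignore audio when the key is not held
            self._chunks.append(chunk)

    def on_key_up(self) -> None:
        if not self._recording:
            return
        self._recording = False
        text = self._transcribe(b"".join(self._chunks))
        if text:
            self._type_text(text)
```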
If you want polish on top, install Ollama separately, pull a small model, and select Ollama as your Flow provider. (PipeVoice does not bundle Ollama; it talks to your local install.)
If you have not installed the app yet, grab it here: download PipeVoice for Windows. One honest heads-up: the app is currently unsigned, so Windows SmartScreen shows an "unrecognised app" warning. Click More info, then Run anyway. Code signing is in progress.
Choosing a Whisper model size: speed vs accuracy
Whisper comes in several model sizes. Smaller models are faster and lighter; larger models are more accurate but need more CPU time. The default download is a compact model (around 150MB), and you can raise the model size in settings if you want better accuracy and can spare the processing time.
General guidance for a typical Windows machine:
| If your priority is | Lean toward | Trade-off |
|---|---|---|
| Fastest response on a modest CPU | A smaller Whisper model | Slightly lower accuracy on hard words |
| Best accuracy, decent CPU available | A larger Whisper model | Longer wait after you release the key |
| Balance of both | A mid-size model | Reasonable on most modern laptops |
Local Whisper is a batch engine: it transcribes after you release the hotkey rather than streaming words live. That means there is a short pause before text appears, and bigger models make that pause longer. Try the default first, then size up only if accuracy is not where you want it.
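For readers curious what a batch pass looks like with faster-whisper directly, here is a hedged sketch. The size names ("tiny", "base", "small") are real faster-whisper model names, but the priority labels and the mapping are assumptions for illustration and do not describe PipeVoice's settings.

```python
# Model size names as used by faster-whisper; the priority labels and the
# mapping below are illustrative assumptions, not PipeVoice settings.
SIZE_BY_PRIORITY = {
    "speed": "tiny",
    "balance": "base",
    "accuracy": "small",
}


def choose_model_size(priority: str) -> str:
    """Map a priority to a model size, defaulting to the balanced choice."""
    return SIZE_BY_PRIORITY.get(priority, SIZE_BY_PRIORITY["balance"])


def transcribe_file(path: str, priority: str = "balance") -> str:
    """Batch-transcribe one finished recording on the CPU."""
    # Imported lazily so the helper above works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(choose_model_size(priority), device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    # segments is a lazy generator; consuming it is what runs the transcription.
    return " ".join(segment.text.strip() for segment in segments)
```

The whole recording is handed over in one call, which is why text appears only after the key is released and why a larger model lengthens that wait.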
Adding offline AI cleanup with Ollama
Raw transcription includes your "um"s, false starts, and missing commas. PipeVoice's Flow mode tidies that up. With Ollama as the provider, the cleanup runs locally:
- Removes filler words and false starts.
- Adds sensible punctuation and capitalisation.
- Keeps everything offline, since Ollama runs on your machine.
Because polish operates on text only, it never touches your audio. So the offline cleanup step adds no privacy cost: the recording was already transcribed locally, and the text never leaves either.
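To make the text-only cleanup step concrete, here is a minimal sketch of a request to a local Ollama server. The endpoint is Ollama's default local address and `/api/generate` route; the prompt wording, the function names, and the `llama3.2` model name are assumptions for illustration, not what PipeVoice sends.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# The prompt wording and default model name are assumptions for illustration.
CLEANUP_PROMPT = (
    "Clean up this dictated text. Remove filler words and false starts, "
    "fix punctuation and capitalisation, and change nothing else.\n\n"
    "Text: {text}"
)


def build_cleanup_request(text: str, model: str = "llama3.2") -> urllib.request.Request:
    """Build a non-streaming generate request that carries text only."""
    payload = {
        "model": model,
        "prompt": CLEANUP_PROMPT.format(text=text),
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def polish(text: str, model: str = "llama3.2") -> str:
    """Send the request to the local Ollama server and return the cleaned text."""
    with urllib.request.urlopen(build_cleanup_request(text, model), timeout=60) as resp:
        return json.loads(resp.read())["response"].strip()
```

The request targets `localhost`, so the text goes no further than the Ollama process on the same machine.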
What "offline" really guarantees
With Local Whisper plus Ollama, the guarantee is simple: once the one-time model download is finished, nothing leaves your PC. No audio, no text, no telemetry. PipeVoice has no account system and runs no servers of its own. There is nothing to sign in to and nothing phones home. You can verify this for yourself: the project is open source, with the full code and documentation public.
This is the core difference from most dictation tools. Even privacy-conscious cloud services still send your audio somewhere. The local path in PipeVoice sends nothing at all.
When cloud engines are still worth it
Offline is not always the right call. PipeVoice lets you pick your engine per task, and two cloud options exist for good reasons:
| Engine | Runs | Best for | Needs a key? |
|---|---|---|---|
| Deepgram | Cloud (streaming) | Fastest, words appear live as you speak | Yes, your own Deepgram key (free to create; usage runs ~pennies/day) |
| OpenAI Whisper | Cloud (batch) | Highest accuracy | Yes, your own OpenAI key |
| Local Whisper | On your PC | Full privacy, no internet, no cost | No |
If you want live, word-by-word feedback, Deepgram streaming is hard to beat. If you want the strongest accuracy and do not mind a batch pass, cloud Whisper leads. Importantly, the cloud engines use your own API key, so audio goes only to the provider you chose, not to PipeVoice. You stay in control of where it lands. For a deeper breakdown, see Deepgram vs Whisper vs OpenAI for dictation.
Hardware tips for good local accuracy
Local Whisper runs on a normal CPU, but a few things help:
- Use a decent microphone. Clean input does more for accuracy than a bigger model. A headset beats a laptop mic across a room.
- Pick the right language and accent. PipeVoice has an accent and language picker (British, US, Australian, Indian, New Zealand English and more). Matching it to how you speak improves results.
- Use the speech notes field for tricky speech. If you have a non-native accent, a stutter, or heavy fillers, the free-text speech notes field lets you describe that so the output handles it better.
- Boost your vocabulary. For jargon, product names, or code identifiers, vocabulary boosting helps the engine get them right.
- Step up the model only if needed. Start with the default model. If accuracy on hard words is lacking and your CPU can handle it, raise the size.
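One common way to pass a language and a vocabulary hint to a Whisper-style engine is through faster-whisper's `language` and `initial_prompt` arguments to `transcribe`. The sketch below assumes that approach; how PipeVoice implements its picker and vocabulary boosting is not documented here, and the helper name and "Glossary:" wording are hypothetical.

```python
from typing import Dict, List


def build_transcribe_options(language: str, vocabulary: List[str]) -> Dict[str, str]:
    """Return keyword arguments for faster-whisper's WhisperModel.transcribe."""
    options = {"language": language}
    if vocabulary:
        # initial_prompt biases decoding toward the spellings it contains.
        options["initial_prompt"] = "Glossary: " + ", ".join(vocabulary) + "."
    return options
```

The result could then be passed as `model.transcribe(path, **options)`, so that jargon and product names listed in the glossary are more likely to come out spelled as written.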
You can also set per-app profiles, so you might run Local Whisper everywhere for privacy but switch to a faster cloud engine in a specific app where speed matters more.
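Conceptually, a per-app profile is a lookup with a private default. The sketch below uses hypothetical engine labels and an example executable name; it does not reflect PipeVoice's real configuration format.

```python
from typing import Dict

DEFAULT_ENGINE = "local-whisper"  # hypothetical label for the offline engine

# Hypothetical profile table keyed by lowercase executable name.
PROFILES: Dict[str, str] = {
    "slack.exe": "deepgram",
}


def engine_for(app: str, profiles: Dict[str, str] = PROFILES) -> str:
    """Pick the engine for the focused app, falling back to the private default."""
    return profiles.get(app.lower(), DEFAULT_ENGINE)
```

Any app without an explicit entry stays on the local engine, so a cloud engine is only ever used where you opted in.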
The bottom line
PipeVoice gives Windows users something most dictation tools do not: a genuinely offline path that is free, keyless, and private, plus the option to bring your own cloud key when you want more speed or accuracy. It is open source, Windows-native, and it types into any app, including the terminal and Claude Code. It is Windows only for now (not Mac or Linux), and the installer is currently unsigned, but the local stack does exactly what it says: your voice stays on your machine.
Download PipeVoice and try the offline path. For a side-by-side with a popular cloud tool, see PipeVoice vs Wispr Flow.