Install PipeVoice, pick an engine, and start talking into any app on Windows — in about a minute.
On first run, PipeVoice asks how you want to transcribe. You can change this any time from the tray menu.
| Engine | Needs | Best for |
|---|---|---|
| Local (offline) | Nothing — runs on your PC | Privacy & zero cost. First use downloads a small model (~150 MB). |
| OpenAI Whisper | OPENAI_API_KEY | Best accuracy. ~$0.006/min, billed by OpenAI. |
| Deepgram | DEEPGRAM_API_KEY | Lowest latency — words appear live as you speak. |
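To put the OpenAI Whisper rate in the table into perspective, here is a minimal sketch of the arithmetic. It assumes the approximate $0.006/min figure quoted above and ignores any rounding OpenAI applies to billed duration; the function name is illustrative and not part of PipeVoice.

```python
# Approximate rate from the engine table; check OpenAI's current pricing before relying on it.
WHISPER_USD_PER_MINUTE = 0.006


def estimate_whisper_cost(minutes_of_speech: float,
                          usd_per_minute: float = WHISPER_USD_PER_MINUTE) -> float:
    """Return the estimated charge in USD for the given minutes of dictated audio."""
    if minutes_of_speech < 0:
        raise ValueError("minutes_of_speech must be non-negative")
    return minutes_of_speech * usd_per_minute


if __name__ == "__main__":
    # Print the estimate for a few example amounts of speech.
    for minutes in (10, 60, 600):
        print(f"{minutes:>4} min -> ${estimate_whisper_cost(minutes):.2f}")
```

At this rate, an hour of actual speech comes to roughly $0.36. Only the audio you record while holding the hotkey counts, not the time PipeVoice sits idle in the tray.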
For the cloud engines, paste your key into the first-run dialog — it's stored locally on your PC and never uploaded. No key? Choose the offline engine and you're running, free.
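If you also keep your keys in environment variables, a quick check like the one below shows which engines you could use right now. This is a sketch under assumptions: the variable names come from the table above, the engine labels (`local`, `openai`, `deepgram`) are made up for illustration, and whether PipeVoice itself reads these variables (rather than only the key saved in the dialog) is not confirmed here.

```python
import os
from typing import Mapping, Optional

# Engine label -> environment variable it needs (None means no key required).
# Labels are illustrative; variable names are taken from the engine table.
ENGINE_KEYS: dict[str, Optional[str]] = {
    "local": None,
    "openai": "OPENAI_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
}


def usable_engines(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the engines that have everything they need in the given environment."""
    usable = []
    for engine, key_name in ENGINE_KEYS.items():
        # The local engine needs no key; cloud engines need a non-empty value.
        if key_name is None or env.get(key_name, "").strip():
            usable.append(engine)
    return usable


if __name__ == "__main__":
    print("Engines you can use now:", ", ".join(usable_engines()))
```

The local engine is always in the result, which mirrors the point above: with no key at all you can still run offline.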
By default, hold Right Ctrl, speak, and release — your words type in wherever the cursor is. A corner pill shows a live mic meter while it listens.
Right-click the tray icon → Settings. You can rebind the hotkey (with a "Capture" button), switch mode, pick your microphone, choose type-vs-paste output, set a language hint, and add vocabulary for your jargon. Changes apply live.
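PipeVoice has no servers of its own and no telemetry. What leaves your computer depends only on the engine you choose: the local engine runs entirely on your PC, while a cloud engine sends your audio to the provider you picked.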
It's open source — audit it on GitHub.
If the target app runs as administrator (some terminals do), Windows blocks keystrokes sent to it from non-elevated apps. Run PipeVoice as administrator too, so both run at the same privilege level.
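To see whether a given Python process is elevated, a small diagnostic like this can help. It is a sketch that uses the Windows shell function `IsUserAnAdmin` through `ctypes`; it is not taken from PipeVoice's code, and the helper name `is_elevated` is illustrative.

```python
import ctypes
import sys


def is_elevated() -> bool:
    """Return True if this process runs as administrator on Windows, otherwise False."""
    if sys.platform != "win32":
        return False  # the elevation check only applies on Windows
    try:
        # IsUserAnAdmin returns non-zero when the process has administrator rights.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        return False


if __name__ == "__main__":
    print("Running as administrator:", is_elevated())
```

Run it from the same kind of shortcut or terminal you use to start PipeVoice. If it reports `False` while the target app is elevated, that mismatch is the likely reason keystrokes are not arriving.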
The local engine downloads its model (~150 MB) the first time you use it. After that it's cached and fast. Bigger models are more accurate but slower — a GPU helps a lot.
Settings → Microphone. Leave it blank to use the system default, or pick one by name/index.
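The blank / name / index behaviour can be pictured roughly as follows. This is a hypothetical sketch, not PipeVoice's actual lookup: `resolve_microphone` is an invented name, and case-insensitive substring matching on the device name is an assumption.

```python
from typing import Optional, Sequence


def resolve_microphone(choice: str, device_names: Sequence[str]) -> Optional[int]:
    """Map a settings value to a device index; None means 'use the system default'."""
    choice = choice.strip()
    if not choice:
        return None  # blank setting -> system default device
    if choice.isdigit():
        index = int(choice)
        if index < len(device_names):
            return index
        raise ValueError(f"no device at index {index}")
    # Otherwise treat the value as a case-insensitive substring of the device name.
    lowered = choice.lower()
    for index, name in enumerate(device_names):
        if lowered in name.lower():
            return index
    raise ValueError(f"no device matching {choice!r}")


if __name__ == "__main__":
    devices = ["Realtek Microphone Array", "USB Headset Mic"]
    print(resolve_microphone("", devices))     # system default
    print(resolve_microphone("1", devices))    # by index
    print(resolve_microphone("usb", devices))  # by name
```

Picking by name tends to survive replugging better than picking by index, since Windows can reorder devices when hardware changes.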
Check the hotkey in Settings and make sure an engine is selected (the tray tooltip shows the current engine). For cloud engines, make sure a valid key is saved.
PipeVoice can act as a local MCP server so agents (Claude Code, Cursor, Cline) can use your voice — no API key, everything runs on your machine.
- listen: the agent asks a question, you answer by voice, and it gets the text back.
- transcribe: hand it a local audio or video file and get a transcript with timestamps, or ready-made captions (format: "srt" / "vtt").

Enable Agent MCP (listen + transcribe) in the tray menu, then register it once with your client:
claude mcp add pipevoice -- python -m wisprlite --mcp

The server is loopback-only and off by default.
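For readers wondering how the two caption formats offered by transcribe differ, the sketch below renders the same timed segments as SRT and as WebVTT. It illustrates the file formats only and makes no claim about how PipeVoice builds its output; `format_timestamp`, `to_captions`, and the `(start, end, text)` segment shape are assumptions made for the example.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def format_timestamp(seconds: float, fmt: str) -> str:
    """Format seconds as HH:MM:SS,mmm for SRT or HH:MM:SS.mmm for WebVTT."""
    if fmt not in ("srt", "vtt"):
        raise ValueError(f"unsupported caption format: {fmt!r}")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    separator = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_captions(segments: Iterable[Segment], fmt: str = "srt") -> str:
    """Render timed segments as an SRT or WebVTT document."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, fmt)} --> {format_timestamp(end, fmt)}"
        if fmt == "srt":
            blocks.append(f"{number}\n{timing}\n{text}")  # SRT cues are numbered
        else:
            blocks.append(f"{timing}\n{text}")  # cue numbers are optional in WebVTT
    body = "\n\n".join(blocks) + "\n"
    # A WebVTT file must begin with the WEBVTT header line.
    return body if fmt == "srt" else "WEBVTT\n\n" + body


if __name__ == "__main__":
    demo = [(0.0, 1.5, "Hello there."), (1.5, 4.0, "This is a caption.")]
    print(to_captions(demo, "srt"))
    print(to_captions(demo, "vtt"))
```

The practical difference is small: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period. Choose "vtt" for web video players and "srt" for most desktop players and editors.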