Speech to Text — Vosk (local)

Streaming offline speech recognition via the Kaldi-based Vosk engine, compiled to WebAssembly.

Truly local. Audio is processed by vosk-browser running entirely in WASM in this tab. No audio is sent to any server. Models download once and are cached.

1. Pick a language and load the model

not loaded

2. Listen

idle
mic level

Transcript

Transcript will appear here. Partial results appear in yellow as you speak; final results lock in at pauses.
0 words
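
The partial/final behavior described above can be sketched as a small state update. This is a minimal illustration, not the page's actual code: the `RecognizerEvent` shape, the function names, and the choice to count only finalized words are all assumptions made for the example. A partial result replaces the in-progress text, while a final result is appended to the locked-in segments and clears the in-progress text.

```typescript
// Hypothetical event shape: a streaming recognizer emits either an
// in-progress hypothesis ("partial") or a locked-in segment ("final").
type RecognizerEvent =
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string };

interface TranscriptState {
  finals: string[]; // segments that are locked in and will not change
  partial: string;  // current hypothesis, replaced on every partial event
}

function emptyTranscript(): TranscriptState {
  return { finals: [], partial: "" };
}

// Pure update: returns a new state rather than mutating the old one.
function applyEvent(state: TranscriptState, ev: RecognizerEvent): TranscriptState {
  const text = ev.text.trim();
  if (ev.kind === "partial") {
    // Partials overwrite each other; they never touch the final segments.
    return { finals: state.finals, partial: text };
  }
  // A final result locks in its text (if non-empty) and clears the partial.
  return {
    finals: text === "" ? state.finals : [...state.finals, text],
    partial: "",
  };
}

// Assumption: the word counter reflects finalized text only.
function countWords(state: TranscriptState): number {
  const joined = state.finals.join(" ").trim();
  return joined === "" ? 0 : joined.split(/\s+/).length;
}
```

In a page like this one, the UI would presumably render `finals` in the normal text color and `partial` in yellow after each update, then refresh the word counter from `countWords`.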

Diagnostic log



How this differs from the Whisper version