Transcribe Audio & Video — Free & Private
Turn speech into text and subtitles with on-device AI. Your file is transcribed entirely in your browser — nothing is ever uploaded.
How it works
Drop in an audio or video file
On-device AI transcribes it locally
Copy the text or download .txt / .srt
About this tool
This free transcription tool turns spoken audio into text and ready-to-use subtitles using OpenAI's Whisper speech model, running entirely inside your browser. It's built for podcasters, students transcribing lectures, journalists working through interviews, and creators who need captions for a video. Unlike the typical “free” transcription site, it never uploads your recording to a server: the AI model downloads to your device once and then does all the work locally. That makes it suitable for confidential interviews and private recordings, and it is why there are no per-minute limits or watermarks. You can download the plain-text transcript or a timestamped .srt subtitle file. A modern device with WebGPU runs it fastest, but it also works in current browsers without WebGPU, just more slowly. This is an early version: accuracy is best on clear speech, and very long files take longer on slower machines.
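For readers curious what a timestamped .srt file contains, here is a minimal sketch of turning timed transcript segments into SRT text. It is an illustration only, not this tool's actual code: the `Segment` shape and the function names `toSrtTime` and `toSrt` are hypothetical, and it assumes the speech model yields start and end times in seconds for each piece of text.

```typescript
// Hypothetical segment shape: start/end in seconds plus the recognized text.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds).
function toSrtTime(seconds: number): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build the .srt body: numbered cues (starting at 1) separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text.trim()}\n`
    )
    .join("\n");
}
```

Each cue is a running number, a `start --> end` time line, and the caption text, which is the layout video players and editors expect when loading an .srt file.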
Why use this tool
Speech to text + subtitles
Get a clean transcript and a timestamped .srt subtitle file from any audio or video with speech.
Private by default
The AI model runs on your device — your recording is never uploaded, so it's safe for confidential interviews.
No limits, no signup
No per-minute caps and no watermark; the model is cached after the first run so it starts instantly next time.
Common use cases
- Transcribe an interview or podcast episode
- Caption a video with a downloadable .srt
- Turn a lecture or meeting recording into notes
- Pull quotes from a voice memo
Frequently asked questions
Is my recording uploaded anywhere?
No. The AI speech model runs entirely in your browser, so your file is never uploaded. The model is served from this site itself (not a third party) and cached after the first use, so nothing goes to an external provider.
Which file formats are supported?
Common audio (MP3, WAV, M4A, OGG) and video (MP4, WebM, MOV) files with an audio track. The audio is decoded locally and fed to the model.
Can I download subtitles?
Yes. Alongside the plain-text transcript, you can download a timestamped .srt subtitle file ready to load into a video player or editor.
Why is the first run slower?
The first time you transcribe, the AI model downloads to your browser (one time only). After that it's cached and starts instantly. A device with WebGPU (most modern desktops) is significantly faster.
How accurate is it?
It uses OpenAI's Whisper model, which is strong on clear speech in many languages. Accuracy drops with heavy background noise, overlapping speakers, or very low-quality audio.
Is it really free?
Yes. No signup, no per-minute caps, and no watermark. Because it runs on your own device, there are no server costs to pass on.
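As a rough illustration of the WebGPU point above, a page can check for WebGPU before choosing how to run the model. This is a sketch under assumptions, not this tool's actual logic: `pickBackend`, the `Backend` labels, and the `"wasm"` fallback are hypothetical names. The only browser API relied on is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable GPU adapter is available.

```typescript
// Hypothetical backend labels: "webgpu" for the fast path, "wasm" as an assumed fallback.
type Backend = "webgpu" | "wasm";

// Minimal structural types so the sketch compiles without WebGPU type definitions.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Pick "webgpu" only when the API exists and an adapter is actually returned.
async function pickBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) return "wasm"; // browser does not expose WebGPU at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no usable GPU adapter
  } catch {
    return "wasm"; // treat any failure as "WebGPU unavailable"
  }
}
```

In a browser this would be called as `pickBackend(navigator)`. Taking the navigator as a parameter keeps the function easy to exercise outside a browser.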