Question 1

What AI model is used for transcription?

Accepted Answer

OpenAI Whisper (tiny model, about 40 MB), running in your browser via Transformers.js. The model downloads once and is cached for future use.

Question 2

Which audio formats are supported?

Accepted Answer

MP3, WAV, M4A, OGG, FLAC, and WebM audio files.
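As a rough illustration, a page could check the file extension against this list before trying to decode the file. This is only a sketch: the names `SUPPORTED_EXTENSIONS` and `isSupportedAudio` are hypothetical, and the tool itself may check MIME types or simply attempt to decode the audio.

```typescript
// Extensions taken from the answer above; the constant name is hypothetical.
const SUPPORTED_EXTENSIONS: string[] = ["mp3", "wav", "m4a", "ogg", "flac", "webm"];

// Returns true when the file name ends in one of the supported extensions.
// The comparison is case-insensitive, so "talk.MP3" is accepted.
function isSupportedAudio(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) {
    return false; // no extension, or the name ends with a dot
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(ext) !== -1;
}
```

An extension check is only a first filter. A file with a supported extension can still fail to decode if its contents are damaged or use an unusual codec.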

Question 3

Is my audio uploaded to any server?

Accepted Answer

No. The Whisper AI model runs entirely in your browser, so your audio never leaves your device.

Question 4

Why is the first transcription slow?

Accepted Answer

The AI model (about 40 MB) has to download the first time you use the tool, and is then cached. Later transcriptions load the cached model and skip the download, so they start faster.

Question 5

Can I download subtitles?

Accepted Answer

Yes. You can download the transcript as .txt (plain text) or .srt (a subtitle file with timestamps for video players).
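For readers curious what an .srt file contains, here is a minimal sketch of turning timed text segments into SRT. The `Segment` shape and the function names are assumptions made for illustration; the tool's actual internal format is not documented here.

```typescript
// Hypothetical shape of one timed segment of transcribed text.
interface Segment {
  start: number; // start time in seconds
  end: number;   // end time in seconds
  text: string;
}

// Left-pad a number with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds).
function toSrtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "," + pad(ms, 3);
}

// Build the SRT text: a 1-based index, a time range, and the caption text,
// with a blank line between consecutive entries.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      String(i + 1) + "\n" +
      toSrtTime(seg.start) + " --> " + toSrtTime(seg.end) + "\n" +
      seg.text.trim() + "\n")
    .join("\n");
}
```

The .txt download would be the same segments with the index and timestamp lines left out, which is why only the .srt file works as subtitles in a video player.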

Audio Transcription (AI)

Frequently Asked Questions