Speech-to-text as a Bearer token.
The same engine that powers transcribe.so, exposed as a single HTTP endpoint. One API key, per-minute pricing, and idempotent retries by default.
Send your first request
One POST, get back a transcription id. Drops into a script, a Cloudflare Worker, or n8n.
curl -X POST https://transcribe.so/api/v1/transcriptions \
  -H "Authorization: Bearer tsk_live_REPLACE_ME" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{"source":"external_url","url":"https://example.com/podcast.mp3","pipeline_code":"qwen3-asr-flash-filetrans"}'
More languages and platform recipes in the cookbook.
Don't want to write code? Use it from ChatGPT or Claude.
Same engine, same per-minute pricing, your wallet. Paste a YouTube link or audio URL into our public GPT or Claude Connector. Sign in once, stay signed in.
Pricing
Same per-minute rates as the dashboard. Usage is billed against your wallet: monthly credit first, then top-up balance. No separate API quota, no surprise minimums.
| Model | Per minute | Per hour | Pipeline code |
|---|---|---|---|
| Qwen Flash (default · cheapest) | $0.0176 | $1.05 | qwen3-asr-flash-filetrans |
| Voxtral Mini (balanced) | $0.0187 | $1.12 | voxtral-mini-transcribe |
| GPT-4o (best diarization) | $0.0538 | $3.23 | gpt-4o-transcribe-diarize |
See full pricing details, free tier, and team plans on the pricing page.
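To make the billing order concrete, here is a minimal Python sketch that estimates a job's cost from the per-minute rates in the table and debits monthly credit before top-up balance. The function names are hypothetical, and pro-rata billing by the second is an assumption; the actual billing granularity is not stated here.

```python
from decimal import Decimal

# Per-minute rates in USD, copied from the pricing table above.
RATES_PER_MINUTE = {
    "qwen3-asr-flash-filetrans": Decimal("0.0176"),
    "voxtral-mini-transcribe": Decimal("0.0187"),
    "gpt-4o-transcribe-diarize": Decimal("0.0538"),
}


def estimate_cost(pipeline_code, duration_seconds):
    """Estimate cost as rate * minutes.

    Assumption: audio is billed pro rata by the second. The real
    rounding rules may differ.
    """
    minutes = Decimal(duration_seconds) / Decimal(60)
    return RATES_PER_MINUTE[pipeline_code] * minutes


def debit_wallet(cost, monthly_credit, topup_balance):
    """Spend monthly credit first, then top-up balance.

    Returns the remaining (monthly_credit, topup_balance).
    """
    from_credit = min(cost, monthly_credit)
    from_topup = cost - from_credit
    if from_topup > topup_balance:
        raise ValueError("insufficient wallet balance")
    return monthly_credit - from_credit, topup_balance - from_topup
```

Under these assumptions, a 10-minute file on the default pipeline comes to 10 × $0.0176 = $0.176, taken from monthly credit first and from top-up balance only for whatever credit does not cover.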
Built for developers
Bearer auth, nothing else
One header. No cookies, no CSRF, no SDK. curl works.
Idempotent retries
Send the same Idempotency-Key twice, get the same response twice. Retries are safe.
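A retry loop built on this might look like the Python sketch below. The endpoint and headers come from the curl example; the function names, the choice to retry only on network errors and 5xx responses, and the backoff schedule are assumptions for illustration, not documented behavior.

```python
import json
import time
import urllib.error
import urllib.request
import uuid

API_URL = "https://transcribe.so/api/v1/transcriptions"  # from the curl example


def build_request(api_key, payload, idempotency_key):
    """Build the POST request with the same headers as the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": idempotency_key,
            "Content-Type": "application/json",
        },
    )


def submit_with_retries(api_key, payload, attempts=3,
                        send=urllib.request.urlopen, sleep=time.sleep):
    """Submit a job, reusing ONE Idempotency-Key across every retry."""
    key = str(uuid.uuid4())  # generated once, never regenerated per attempt
    for attempt in range(attempts):
        try:
            with send(build_request(api_key, payload, key)) as resp:
                return json.loads(resp.read().decode("utf-8"))
        except urllib.error.HTTPError as err:
            # Assumption: only 5xx responses are worth retrying.
            if err.code < 500 or attempt == attempts - 1:
                raise
        except urllib.error.URLError:
            # Network-level failure: retry unless this was the last attempt.
            if attempt == attempts - 1:
                raise
        sleep(2 ** attempt)  # illustrative backoff: 1s, 2s, 4s, ...
```

The key is created outside the loop on purpose: a fresh key per attempt would make each retry look like a new request and defeat the guarantee.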
Async with polling
Submit, get a transcription id, poll GET /transcriptions/:id until it's done. Webhooks coming next.
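The polling half can be sketched in Python as follows. The response shape is not specified on this page, so the `status` field and its terminal values below are assumptions, as are the function names, the base URL for the GET route (taken from the POST endpoint shown earlier), and the interval and timeout defaults.

```python
import json
import time
import urllib.request

BASE_URL = "https://transcribe.so/api/v1"  # assumed: same base as the POST endpoint

# Assumption: the response JSON carries a "status" field and these values
# mean the job has finished. Check the API reference for the real names.
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_transcription(api_key, transcription_id):
    """GET /transcriptions/:id with Bearer auth and return the parsed JSON."""
    req = urllib.request.Request(
        f"{BASE_URL}/transcriptions/{transcription_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_until_done(api_key, transcription_id, interval=5.0, timeout=600.0,
                    fetch=fetch_transcription, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll at a fixed interval until a terminal status or the timeout."""
    deadline = clock() + timeout
    while True:
        body = fetch(api_key, transcription_id)
        if body.get("status") in TERMINAL_STATUSES:
            return body
        # Stop before sleeping if the next poll would land past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(
                f"transcription {transcription_id} not done after {timeout}s"
            )
        sleep(interval)
```

A fixed interval keeps the sketch short; longer files may be better served by a growing interval so the loop does not issue many requests while the job is still running.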
Multilingual + diarization
Whisper, Qwen3-ASR-Flash, Voxtral, GPT-4o-transcribe-diarize. Pick the model per request from the same set you use in the dashboard.
Two upload modes
Presigned S3 PUT for one-shot uploads, resumable tus for large files or flaky networks. Both feed the same POST /transcriptions.
Ready to ship?
Create a key, paste it into your script, and you're transcribing inside a minute. No SDK to install. Bearer auth and you're done.