Turn lectures, podcasts, and videos into a searchable library. In any language.

Paste a YouTube link or upload audio. Ask anything across your whole library and jump to the exact second they said it. Cited answers, chapters, and export-ready subtitles included.

A simple HTTP API lets you plug the same engine into AI agents, automations, meeting bots, and your own apps. Read the API docs · Try it on ChatGPT

No credit card required. Pay only for what you use. 67 languages.

Try this real transcript
44 Harsh Truths About The Game Of Life - Naval Ravikant (4K)
Chris Williamson
Contents
8 chapters · 513 sections
1. Happiness Versus Success: Philosophical Reflections on Contentment, Desire, and Motivation
2. Optimizing Sleep: Smart Temperature Regulation and the Foundations of Self-Esteem
3. Decisive Action and Iterative Practice: Keys to Optimal Choices and Mastery
4. Wealth Management: From Materialism to Value Creation and Fair Compensation
5. Evaluating LLMs: Capabilities, Limitations, and Their Role in AI's Evolving Landscape
6. Pathogens, Evolution, and Knowledge: How Humans Adapt and Defend
7. Agency, Power, and the Individual: From Child Development to Cultural Conflict
8. Unseen Trends: Media Oversights, Medical Limitations, and the Primitive State of Modern Biology
Q&A preview
Answer
Naval explains two distinct paths to happiness using the story of Alexander and Diogenes. The first path is through success—conquering the world, satisfying material needs, and getting what you want. The second path, exemplified by Diogenes living in a barrel, is simply not wanting in the first place. As Socrates said when shown luxuries: 'How many things there are in this world that I do not want.' Naval suggests not wanting something is as good as having it—both paths lead to the same destination of contentment [00:38–01:10]. He's not sure which path is more valid, noting it depends on how you define success [01:10–01:25].


Powered by

OpenAI · Qwen · Mistral
Works with
YouTube
Google Meet
Zoom
Microsoft Teams
Loom
Voice Memos
Video files
Audio files
Export to
CapCut
Final Cut Pro
Premiere Pro
DaVinci Resolve
Copy to
Notion
Apple Notes
Google Keep
OneNote
Evernote
Obsidian
WhatsApp
Slack
Telegram

Supports MP4, MOV, WebM, MP3, WAV, M4A, AAC, FLAC, OGG, and more.

YouTube to text · Speech to text · Audio to text · Video to text · Voice note to text · Google Meet to text · Loom to text · Lecture video to notes · Subtitle generator · Searchable transcripts

How it works

Three steps from a recording to a searchable library

Real screenshots from a real transcript. No mockups, no marketing fluff. This is what you get.

  1. Drop in your audio

    Paste a link or upload a file. Anything you record, download, or screen-capture works.

    Paste a YouTube URL and pick the best speech-to-text model for your language
  2. Get a structured transcript

    Every transcript comes back with chapters, topic summaries, and timestamps so you can jump straight to what matters.

    Auto-generated chapters and topic summaries with timestamps
  3. Ask anything across your library

    Ask a question and get a cited answer with a timestamp. Search across every transcript in your library. Export subtitles if you need them.

    AI Q&A answer with inline timestamps and cited topic cards

No credit card required. Pay only for what you use.

The real cost of long videos

You remember the answer is in there. You just can't find it.

Long lectures and podcasts hold the exact moment you need. The summary won't show it. The timeline won't either. So you scrub, overshoot, and start the video over.

And when the video is in a language your tool barely understands, the transcript is wrong before you even start looking.

Per long video

3-hour lecture you need to study

+ 20–40 min scrubbing to find one explanation

+ Replaying the same 90-second clip three times

+ A summary that skipped the part you wanted

= You learned less than the hours you put in

Per week

5 long videos to get through

+ 20–30 min hunting for moments in each

= 1.5–2.5 hours of study time gone every week

Library, not a one-off transcript

Build a knowledge library you can actually search

Most tools give you one transcript at a time. transcribe.so turns every lecture, podcast, and video into a searchable, askable library that grows with you.

One searchable library across everything

Every YouTube link, lecture, and podcast joins one library. Find a quote across hours of content in seconds.

Ask anything. Jump to the second they said it.

Ask a question and get a cited answer with a timestamp. Click the citation to jump straight to the moment in playback.

Works in any language, automatically

67 languages with measured accuracy per language. We pick the right speech-to-text engine for you, so you focus on studying.

That's roughly 1.5–2.5 hours of study time back every week.

No credit card required. Pay only for what you use.

For power users

Under the hood: the speech-to-text engines we route to

You don't need to pick a model. We route each file to the right engine for your language automatically. If you want to override the default, here's what powers your library: GPT-4o, Qwen3-ASR-Flash, and Voxtral. Chapters, library search, cited Q&A, subtitles, and exports work the same way across all of them.

Premium
GPT-4o Transcribe Diarize
Best-in-class diarization with built-in speaker labels
Built-in speaker identification (who said what)
58 languages, sentence timestamps
Hosted by OpenAI for enterprise reliability
OpenAI
Top-Tier
Qwen3-ASR-Flash
Leaderboard-leading accuracy with word-level timestamps
#1 on the Hugging Face Open ASR Leaderboard (4.25% avg WER)
33 languages, word timestamps (10 langs)
Emotion detection, long-form audio
Alibaba Qwen3
New
Voxtral Mini Transcribe
Word-level timestamps with speaker labels
Word-level timestamps in 13 languages
Speaker labels & context biasing
13 languages, lowest cost per minute
Mistral AI
Search backbone
Semantic Search & AI Q&A
Powers search by meaning and AI Q&A across every transcript, no matter which ASR model produced it.
Hybrid retrieval with second-stage reranking
Citation-grounded answers with timestamps
Find moments by meaning, not just keywords
Frontier embedding + LLM stack

A note from the maker

Hey, I'm Seunghun 👋

In 2023 I left Spotify to work on the problem of finding the useful 90 seconds inside a three-hour podcast. We built goodlisten.co, ran out of runway, and I went back to a desk job.

But I kept needing it myself. English was the easy part. The audio I actually cared about was harder: Korean podcasts where the host slips into English, Japanese conversations with three speakers, Spanish lectures recorded in noisy rooms. I was tired of spending two hours just to find the two minutes that mattered.

So in 2025 I stopped trying to build for “the market” and built the tool I wished existed, for one very specific user: me. If it saves you time, tell me. If it doesn't, tell me directly. That's how it gets better.

Seunghun

Who it's for

Built for learners. With an API for builders.

Whether you're studying from a 3-hour lecture, a foreign-language podcast, or building an AI agent that needs to listen, the same engine handles it.

Students and lifelong learners

Turn every lecture, podcast, and video you study into a searchable library. Get cited answers tied to the exact second they were said.

Learners studying in any language

Korean MOOCs, Japanese podcasts, Spanish talks, or English lectures. We pick the right speech-to-text engine for each of 67 languages, with measured accuracy.

Developers building AI apps

One HTTP API plus a Claude and ChatGPT MCP surface. Plug the same engine into agents, automations, voice memo apps, and meeting bots.

No credit card required. Pay only for what you use.

Use it in your app

One HTTP API. Plus an MCP server for Claude and ChatGPT.

Same engine that powers your library. Chapters, library search, cited Q&A, subtitles, exports. Hit it from any agent, video tool, meeting bot, or voice app.

One curl, full pipeline

curl https://transcribe.so/api/v1/transcriptions \
  -H "Authorization: Bearer tsk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "source": "youtube",
    "url": "https://youtu.be/dQw4w9WgXcQ",
    "pipeline_code": "qwen3-asr-flash-filetrans"
  }'

YouTube, file upload via presigned S3, or any direct audio URL. Same response shape.
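The same request from Python, for anyone wiring this into an agent or backend without shelling out to curl. This sketch mirrors the curl example above exactly; the endpoint, headers, and body fields are copied from it, and nothing about the response format is assumed here.

```python
import json
import urllib.request

API_URL = "https://transcribe.so/api/v1/transcriptions"

def build_request(api_key: str, youtube_url: str,
                  pipeline: str = "qwen3-asr-flash-filetrans") -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    payload = {
        "source": "youtube",
        "url": youtube_url,
        "pipeline_code": pipeline,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Send it with: urllib.request.urlopen(build_request("tsk_live_...", url))
```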

  • AI agents

    Drop a transcript into your agent's context. Claude, ChatGPT, Cursor, anything that calls HTTP.

  • Video editors and tools

    Word-level timestamps, burn-in captions, SRT/VTT export. Same engine as the dashboard.

  • Meeting bots and call platforms

    Transcribe Zoom, Twilio, or any recording the moment a call ends. Webhook fires when ready.

  • Voice memos, podcasts, language apps

    67 languages with measured accuracy per language. Auto-detect or pin a specific code per request.

Use cases

From transcription to something actually useful

Whether you are publishing, editing, researching, or learning, transcribe.so helps you get usable output from long-form content faster.

Podcast and interview transcription

Search long conversations, find strong quotes, and jump straight to important moments with chapters, citations, and playback.

Subtitle creation for videos

Generate subtitles that are easier to use in your editing workflow, with more control than rough auto-captions.

Learning from YouTube and lectures

Turn long videos into structured content with chapters, cited answers, and searchable playback so you can study faster.

Meeting and recording review

Upload calls, notes, or voice recordings and quickly find decisions, highlights, and follow-up moments without re-listening to everything.

No credit card required. Pay only for what you use.

FAQ

Before you try transcribe.so

Start your library with one link.

Paste any lecture, podcast, or video. Free credits to start. See the exact cost before you confirm.

No credit card required. Pay only for what you use.

Keep scrolling for details

Product features in depth

What you get with every transcript

Every upload joins your searchable library. You get chapters, cited Q&A with timestamps, library-wide search, subtitles for any editor, and full exports. The right engine for your language is picked automatically.

In the box

Library-wide search across every transcript you own
Cited Q&A with timestamps. Click to jump to the second they said it
AI-generated chapters and section detection
AI summary and key takeaways
Speaker labels on multi-speaker audio
Entity extraction (people, places, brands)
67 languages with measured accuracy per language
Subtitle exports (SRT, WebVTT, karaoke VTT, JSON)
Encrypted Cloudflare R2 storage. Your audio is never used for training
Power-user toggle: override the default engine (GPT-4o, Qwen3-ASR-Flash, or Voxtral)

No credit card required. Pay only for what you use.

Subtitles

Subtitles ready for any editor

Every transcript ships with word-level timestamps, formatted as SRT or WebVTT for CapCut, Premiere Pro, DaVinci Resolve, or Final Cut Pro. Pick a platform preset or tune every parameter.

Platform Presets

One-click presets tuned for each platform's readability standards. Each preset controls characters per line, max lines, reading speed (CPS), timing gaps, and more.

YouTube
Long-form captions optimized for readability
20 CPS · 2 lines
TikTok / Shorts
Short, punchy single-line captions
20 CPS · 1 line
Netflix-style
Professional broadcast with strict reading speed
17 CPS · 2 lines
Podcast
Longer segments with speaker labels
15 CPS · 2 lines
Broadcast / TV
Traditional broadcast standards
15 CPS · 2 lines
Custom
Full control over every parameter
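To make the preset numbers concrete: CPS (characters per second) is a cue's visible character count divided by its on-screen duration. Here is a minimal sketch of that check, using CPS and line limits from the table above; the function itself is illustrative, not the product's implementation.

```python
# Reading-speed and line limits taken from the preset table above.
PRESETS = {
    "youtube": {"max_cps": 20, "max_lines": 2},
    "netflix": {"max_cps": 17, "max_lines": 2},
    "podcast": {"max_cps": 15, "max_lines": 2},
}

def cue_ok(text: str, start: float, end: float, preset: str) -> bool:
    """True if the cue fits the preset's reading speed and line count."""
    p = PRESETS[preset]
    lines = text.split("\n")
    # CPS = total visible characters / seconds on screen
    cps = sum(len(line) for line in lines) / (end - start)
    return cps <= p["max_cps"] and len(lines) <= p["max_lines"]
```

For example, a two-line cue of 22 characters shown for 2 seconds reads at 11 CPS, well inside the YouTube preset; the same text crammed into half a second would fail.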

Export Formats

Export in the format your video editor needs. SRT and WebVTT import directly into CapCut, Premiere Pro, DaVinci Resolve, and Final Cut Pro.

SRT
CapCut, Premiere Pro, DaVinci Resolve, Final Cut Pro & more
WebVTT
Web players, CapCut, and editors with styling support
Karaoke VTT
Word-by-word highlight timing
JSON
Full data with word timestamps

Powered by Word-Level Timestamps

Unlike simple text-splitting tools, our subtitle engine uses precise word-level timestamps from your transcription to build optimally timed cues.

Line breaks chosen for readability, not character count
Smart line breaking at natural pauses
CPS-aware reading speed optimization
Automatic gap and duration enforcement
Speaker label support for multi-speaker content
Live preview before export
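The grouping idea above can be sketched in a few lines. This is a toy illustration, not the product's engine: it folds (word, start, end) tuples into SRT cues, starting a new cue when the pause before a word exceeds a gap threshold or a character budget is hit. The thresholds and the tuple shape are assumptions for the example.

```python
def to_srt(words, max_chars=40, gap=0.8):
    """Group (word, start, end) tuples into SRT-formatted cues.

    A new cue starts when adding a word would exceed max_chars,
    or when the silence before it is longer than `gap` seconds.
    """
    def ts(t):  # SRT timestamp: HH:MM:SS,mmm
        h, rem = divmod(int(t * 1000), 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    cues, cur = [], []
    for w, start, end in words:
        if cur and (len(" ".join(x[0] for x in cur)) + 1 + len(w) > max_chars
                    or start - cur[-1][2] > gap):
            cues.append(cur)
            cur = []
        cur.append((w, start, end))
    if cur:
        cues.append(cur)

    # Each cue: index, "start --> end" line, then the text.
    return "\n".join(
        f"{i}\n{ts(c[0][1])} --> {ts(c[-1][2])}\n{' '.join(x[0] for x in c)}\n"
        for i, c in enumerate(cues, 1)
    )
```

Feeding it three words with a long pause before the last one yields two cues, each with start/end timestamps taken from its first and last word.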
Privacy First

Your private files stay private

Worried about uploading sensitive audio? Privacy is built into the platform from the bottom up.

Encrypted Storage

Your files are stored in private Cloudflare R2 buckets with time-limited access links. Only you can view your transcriptions.

Instant Deletion

Delete anytime. All data is instantly removed from our servers. No backups, no retention, completely gone.

Trusted Infrastructure

Inference and embeddings via trusted enterprise providers (OpenAI, Mistral, and partners). Storage on Cloudflare R2. No other third parties involved.

Your Data, Your Control

We don't use your content for AI training. Your transcriptions are private and never shared or made public.

Questions about privacy? Contact us

Export & Share

Copy or export anything you read

Export anything in Markdown. Chapters, search results, and Q&A history all carry timestamps that link back to the source.

Table of Contents
Chapters
Search Results
Q&A History
One-click copy · Markdown download · Playable YouTube links · Direct timestamps · Time ranges