PRODUCT · 5 min · TUE · JUN 09 · 2026

One voice provider, plus yours

We cut three TTS providers down to one and added per-account voice cloning. What changed, what the cap is, and how to pick.

The VidFlow Team · Notes from the people building the pipeline

An earlier version of this post was a picking guide between three TTS providers. That lineup is gone. There is one voice engine in the VidFlow code now — and one new capability that matters more than a provider menu ever did: cloning your own voice.

Why we cut to one

Three providers meant three auth paths, three failure modes, and three catalogs that drifted independently — plus silent cross-provider fallbacks that swapped your narrator mid-run when one vendor hiccuped. The honest observation from usage: most creators picked a voice once and never opened the menu again. One deeply wired vendor beats three half-wired ones, so we kept the one with the best clone path and deleted the rest.

The library

Voices are fetched live from our catalog — searchable, with a play-preview on every voice, varying in gender and tonal character. We don't curate a count; the catalog is ours and it shifts as voices are added, so any "N+ voices" headline would be borrowed. Preview a few, pick one, pin it as the channel default if you want every project under that channel to open with it.

Cloning, three ways

Upload a short audio sample (5–15 seconds, wav/mp3/webm), record one directly in the browser, or describe the voice you want in words and have one designed. Each route produces a custom voice tied to your account that shows up in the same selector as the library and works in any of your projects. The cap is 5 custom voices per account — delete one to make room for another. Cloning itself is free within the cap; the cap is the cost control.
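For readers who think in code, the limits above can be read as a pre-flight check. This is a minimal sketch, not our actual implementation: `Account`, `check_clone_request`, and the error strings are hypothetical names invented for illustration. Only the numbers (5 to 15 seconds, wav/mp3/webm, 5 custom voices) come from this post.

```python
from dataclasses import dataclass, field

# Limits as stated in this post.
ALLOWED_FORMATS = {"wav", "mp3", "webm"}
MIN_SECONDS, MAX_SECONDS = 5.0, 15.0
MAX_CUSTOM_VOICES = 5


@dataclass
class Account:
    # Hypothetical stand-in for an account record: IDs of existing custom voices.
    custom_voices: list[str] = field(default_factory=list)


def check_clone_request(account: Account, filename: str, duration_s: float) -> list[str]:
    """Return a list of problems; an empty list means the request may proceed."""
    problems = []

    # Format check based on the file extension (an assumption; a real
    # service would more likely inspect the audio container itself).
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {ext or 'none'}")

    # Duration must fall inside the 5 to 15 second window.
    if not MIN_SECONDS <= duration_s <= MAX_SECONDS:
        problems.append(f"sample must be {MIN_SECONDS:g}-{MAX_SECONDS:g} seconds")

    # The cap is per account: at 5 voices, one must be deleted first.
    if len(account.custom_voices) >= MAX_CUSTOM_VOICES:
        problems.append("custom voice cap reached; delete one to make room")

    return problems
```

Returning every problem at once, rather than failing on the first, means a creator with a bad file and a full account hears about both in one round trip.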

Consent, explicitly

A voice sample is personal data. The clone endpoint refuses to run without your explicit consent flag, and the sample clip is only stored after the clone succeeds — a failed clone never leaves an orphaned recording on our storage. We don't train models on your samples; a cloned voice is scoped to your account, not a fine-tune.
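The ordering matters more than the wording, so here is a rough sketch of it. Everything named here is an assumption for illustration: `clone_voice`, `ConsentRequired`, and the injected `run_clone` and `store_sample` callables are hypothetical and do not describe the real endpoint. What the sketch shows is the two guarantees above: no consent means no clone, and storage happens only after the clone call returns.

```python
from typing import Callable


class ConsentRequired(Exception):
    """Raised when a clone is requested without an explicit consent flag."""


def clone_voice(
    sample: bytes,
    consent: bool,
    run_clone: Callable[[bytes], str],            # hypothetical: returns a voice ID, may raise
    store_sample: Callable[[str, bytes], None],   # hypothetical: persists the clip
) -> str:
    # Gate 1: refuse before touching the sample at all.
    if not consent:
        raise ConsentRequired("explicit consent is required to clone a voice")

    # Gate 2: clone first. If this raises, nothing has been stored yet,
    # so a failed clone cannot leave an orphaned recording behind.
    voice_id = run_clone(sample)

    # Only reached after a successful clone.
    store_sample(voice_id, sample)
    return voice_id
```

Passing the clone and storage steps in as callables keeps the sketch runnable without a vendor SDK, and makes the order of operations easy to test with fakes.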

Multi-voice still works

MultiVoice mode assigns a different voice per character — the narrator stays consistent across the project and named characters get their own voices, cloned or stock.
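One way to picture the assignment, as a sketch only: `assign_voices`, the voice IDs, and the script shape are made up for illustration, and the fallback of unassigned speakers to the narrator voice is our assumption here rather than documented behavior.

```python
from typing import Optional


def assign_voices(
    lines: list[tuple[Optional[str], str]],
    narrator_voice: str,
    character_voices: dict[str, str],
) -> list[tuple[str, str]]:
    """Map each (speaker, text) line to a (voice_id, text) pair.

    A speaker of None means narration. Named characters use their own
    voice when one is assigned; anyone else falls back to the narrator
    voice (an assumption made for this sketch).
    """
    result = []
    for speaker, text in lines:
        if speaker is None:
            voice = narrator_voice
        else:
            voice = character_voices.get(speaker, narrator_voice)
        result.append((voice, text))
    return result


script = [
    (None, "It was raining when she arrived."),
    ("Mara", "You're late."),
    (None, "He did not look up."),
    ("Jon", "Traffic."),
]
plan = assign_voices(
    script,
    narrator_voice="stock_narrator",
    character_voices={"Mara": "clone_mara", "Jon": "stock_jon"},
)
```

Because the narrator voice is a single argument, it cannot drift between lines, and a cloned voice ID sits in the mapping exactly like a stock one.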

What happened to "bring your own voice ID"?

Gone with the extra providers. The old escape hatch — pasting a voice ID from a third-party account — only existed because we couldn't clone. Now you clone the voice you actually want instead of proxying someone else's catalog.

Picking guide, short version

- Your own voice on your own channel → record 15 seconds and clone it.
- A specific character sound you can describe but can't record → voice design, in words.
- Just need a solid narrator → preview the library, pin the winner as the channel default.

One provider, plus yours. That's the whole voice story now.

