docs/production.md — VidFlow

Production.

Shot planning, parallel generation, review, render, ship.

Production is where the script becomes a video. The shot planner cuts the timeline into shots, each shot becomes an image then a video clip (or a Ken Burns animated still for slower beats), and the final render stitches everything together. This is the most expensive stage and the one most likely to surface failures, so it's worth understanding the order.

Shot plan first. The planner is deterministic — not LLM-based. It walks the word-level timestamps from Alignment and greedy-windows them into shots, picking boundary points on sentence terminators and pause gaps. Each shot is then tagged with a semantic beat (HOOK, CLAIM, EXAMPLE, CONTRAST, EVIDENCE, TRANSITION, CONCLUSION) based on the narration text, plus an assigned location and zero-or-more character references.
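The greedy windowing described above can be sketched roughly as follows. This is a minimal illustration, not the planner itself: the `Word` and `Shot` shapes, the `planShots` name, and the `minShot`, `maxShot`, and `pauseGap` thresholds are all assumptions made for the example, and beat tagging is left out.

```typescript
// Hypothetical shapes and thresholds; the real planner's types are not documented here.
interface Word { text: string; start: number; end: number } // times in seconds
interface Shot { start: number; end: number; text: string }

interface PlanOptions {
  minShot: number;  // shortest allowed shot, seconds (assumed)
  maxShot: number;  // longest allowed shot, seconds (assumed)
  pauseGap: number; // silence long enough to count as a pause, seconds (assumed)
}

// A word ends a sentence if it finishes with . ! or ?, optionally followed by a closing quote or bracket.
const endsSentence = (w: Word): boolean => /[.!?]["')\]]?$/.test(w.text);

function planShots(words: Word[], opts: PlanOptions): Shot[] {
  const shots: Shot[] = [];
  let from = 0; // index of the first word in the current window
  for (let i = 0; i < words.length; i++) {
    const length = words[i].end - words[from].start;
    const next = words[i + 1];
    const isLast = next === undefined;
    const pause = !isLast && next.start - words[i].end >= opts.pauseGap;
    const nextOverflows = !isLast && next.end - words[from].start > opts.maxShot;
    const goodBoundary = length >= opts.minShot && (endsSentence(words[i]) || pause);
    // Close the window at a natural boundary, before it would overflow, or at the end.
    if (isLast || goodBoundary || nextOverflows) {
      shots.push({
        start: words[from].start,
        end: words[i].end,
        text: words.slice(from, i + 1).map((w) => w.text).join(" "),
      });
      from = i + 1;
    }
  }
  return shots;
}
```

Because the function uses no randomness and no model calls, the same timestamps always yield the same cut, which is the property the paragraph above relies on.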

Mode picking. The planner picks a generation mode per shot: VIDEO (full clip), ANIMATED_STILL (image with FFmpeg Ken Burns motion), STILL (static image), or INFOGRAPHIC_VIDEO. Your video style and timing constraints drive the mix. REALISM long-form gets mostly VIDEO with some ANIMATED_STILL for slower beats. Hybrid mode caps video at the first N minutes and switches to animated stills past that point, which is cheaper and faster to render. The Shot plan card shows the distribution: total shots, video shots, animated stills, and infographics.
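As an illustration of the hybrid cap and the distribution shown on the card, here is a small sketch. The `PlannedShot` shape and both function names are hypothetical, and the rule "a VIDEO shot that starts at or after the cap becomes ANIMATED_STILL" is an assumption about how the cap is applied, not documented behavior.

```typescript
type Mode = "VIDEO" | "ANIMATED_STILL" | "STILL" | "INFOGRAPHIC_VIDEO";
interface PlannedShot { start: number; mode: Mode } // start in seconds (hypothetical shape)

// Assumed rule: VIDEO shots starting at or past the cap are downgraded to animated stills.
function applyHybridCap(shots: PlannedShot[], capMinutes: number): PlannedShot[] {
  const capSeconds = capMinutes * 60;
  return shots.map((s): PlannedShot =>
    s.mode === "VIDEO" && s.start >= capSeconds ? { ...s, mode: "ANIMATED_STILL" } : s
  );
}

// Tally shots per mode, similar in spirit to the distribution on the Shot plan card.
function distribution(shots: PlannedShot[]): Record<Mode, number> {
  const counts: Record<Mode, number> = { VIDEO: 0, ANIMATED_STILL: 0, STILL: 0, INFOGRAPHIC_VIDEO: 0 };
  for (const s of shots) counts[s.mode] += 1;
  return counts;
}
```

Only VIDEO shots are touched by the cap in this sketch; infographic and still shots keep whatever mode the planner gave them.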

Plan validation. The card surfaces issues if the plan has timeline gaps or duration constraint violations. Most warnings are non-blocking; a hard error means the planner couldn't fit a particular span without breaking the constraints. Re-plan (force) re-runs the planner from scratch, useful when you've edited the script and want a fresh cut.
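The two kinds of issue named above, timeline gaps and duration violations, can be checked with a single pass over the plan. The sketch below is illustrative only: the `Span` and `Issue` shapes, the `validatePlan` name, and the gap tolerance are assumptions, and it does not try to reproduce how the card separates warnings from hard errors.

```typescript
interface Span { start: number; end: number } // seconds (hypothetical shape)
interface Issue { kind: "GAP" | "TOO_SHORT" | "TOO_LONG"; index: number; detail: string }

function validatePlan(shots: Span[], minShot: number, maxShot: number, tolerance = 0.05): Issue[] {
  const issues: Issue[] = [];
  shots.forEach((shot, i) => {
    const duration = shot.end - shot.start;
    if (duration < minShot) {
      issues.push({ kind: "TOO_SHORT", index: i, detail: `${duration.toFixed(2)}s < ${minShot}s` });
    }
    if (duration > maxShot) {
      issues.push({ kind: "TOO_LONG", index: i, detail: `${duration.toFixed(2)}s > ${maxShot}s` });
    }
    // A gap is uncovered timeline between the previous shot's end and this shot's start.
    if (i > 0 && shot.start - shots[i - 1].end > tolerance) {
      const gap = shot.start - shots[i - 1].end;
      issues.push({ kind: "GAP", index: i, detail: `${gap.toFixed(2)}s uncovered before shot` });
    }
  });
  return issues;
}
```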

Generation, image-first. Each shot generates an image first (KIE nano-banana-2 with seedream/gpt-image fallback on content-policy rejects), then a video clip from that image (KIE Kling 2.6 by default, Veo3 available). The image-before-video order is a hard precondition the generator enforces. Each shot's image references the location and character assets from your Visual Bible — that's how the protagonist in shot 4 looks like the protagonist in shot 47.
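The fallback behavior can be pictured as a loop over providers that moves on only when a provider rejects for content-policy reasons. This is a sketch under stated assumptions: the `ImageProvider` interface, the `CONTENT_POLICY` error code, and the `generateImage` name are invented for the example and do not describe the KIE client or any vendor API.

```typescript
// Hypothetical provider interface; generate() resolves to an image URL.
interface ImageProvider { name: string; generate: (prompt: string) => Promise<string> }

// Assumed convention: policy rejections carry code === "CONTENT_POLICY".
function isPolicyReject(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { code?: string }).code === "CONTENT_POLICY";
}

async function generateImage(
  prompt: string,
  chain: ImageProvider[]
): Promise<{ provider: string; url: string }> {
  let lastReject: unknown = new Error("empty provider chain");
  for (const provider of chain) {
    try {
      return { provider: provider.name, url: await provider.generate(prompt) };
    } catch (err) {
      // Only policy rejections fall through to the next provider; anything else surfaces.
      if (!isPolicyReject(err)) throw err;
      lastReject = err;
    }
  }
  throw lastReject; // every provider rejected the prompt
}
```

Timeouts and network errors are deliberately not retried here, so they reach the shot's own error handling instead of silently consuming the fallback chain.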

Bulk generation, in parallel. Click 'Generate all' and the production runner fans shots out across an in-browser pool — up to 8 in flight at once by default, tuned to stay under vendor rate limits. As each completes, the next slot fills. You watch progress in real time; the per-shot card flips PLANNED → IMAGE READY → VIDEO READY as it lands. (There's also an opt-in BullMQ worker path, off by default, that runs the same work in a separate process.)
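A bounded pool of this kind is commonly built from a fixed number of "lanes" that each pull the next item as soon as they finish the current one. The sketch below shows that general technique; `runPool` and the `Settled` type are hypothetical names, and this is not the production runner's code.

```typescript
type Settled<R> = { status: "ok"; value: R } | { status: "failed"; error: unknown };

// Runs worker over items with at most `limit` calls in flight; results keep input order.
async function runPool<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<Settled<R>[]> {
  const results = new Array<Settled<R>>(items.length);
  let next = 0; // shared cursor: index of the next unclaimed item

  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an item; safe because JS runs this synchronously
      try {
        results[i] = { status: "ok", value: await worker(items[i]) };
      } catch (error) {
        results[i] = { status: "failed", error }; // one failure never stops the lane
      }
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

A call such as `runPool(shots, 8, generateShot)` (with `generateShot` standing in for whatever produces one shot) keeps eight shots in flight and refills a slot the moment one completes, matching the behavior described above.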

Failures. Each shot's outer catch saves the error to shot.lastError and the Production card surfaces it inline under the FAILED badge. Common causes: a content-policy reject (the fallback chain rotates through three image providers automatically), an unreachable reference URL, a vendor timeout. Click Generate on the failed shot to retry. Credits for shots that failed before producing any usable artifact are refunded automatically.
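To make the failure path concrete, here is a sketch of a per-shot runner with an outer catch. Only `lastError` and the status names come from the text above; the `ShotState` shape, the `runShot` and `isRefundable` names, the reuse of an existing image on retry, and the reading of "usable artifact" as "an image or a video" are all assumptions.

```typescript
type ShotStatus = "PLANNED" | "IMAGE_READY" | "VIDEO_READY" | "FAILED";
interface ShotState { status: ShotStatus; imageUrl?: string; videoUrl?: string; lastError?: string }

async function runShot(
  shot: ShotState,
  makeImage: () => Promise<string>,
  makeVideo: (imageUrl: string) => Promise<string>
): Promise<ShotState> {
  let current: ShotState = shot;
  try {
    // Image first: the video step only ever runs with an image in hand.
    // Assumption: a retry reuses an image that already exists.
    const imageUrl = shot.imageUrl ?? (await makeImage());
    current = { status: "IMAGE_READY", imageUrl };
    const videoUrl = await makeVideo(imageUrl);
    return { status: "VIDEO_READY", imageUrl, videoUrl };
  } catch (err) {
    // Outer catch: keep what was produced and record the message for the FAILED badge.
    const lastError = err instanceof Error ? err.message : String(err);
    return { ...current, status: "FAILED", lastError };
  }
}

// Assumed refund rule: failed with neither an image nor a video to show for it.
const isRefundable = (s: ShotState): boolean =>
  s.status === "FAILED" && !s.imageUrl && !s.videoUrl;
```

Retrying is then just calling `runShot` again with the failed state, which mirrors clicking Generate on the failed shot.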

Music & thumbnails, parallel. The Music card and the Thumbnails card both run independently of shot generation. Three thumbnail variants generate in parallel; you pick the one you like. Neither card gates the render: you can ship without music, and you can pick a thumbnail later.

Render. Once shots are approved, render stitches them together with the voiceover, the music (if any), and any text overlays. Local FFmpeg is the default render path; a 3 to 7 minute video usually renders in a few minutes. The output is an .mp4 written to R2 storage. The Watch button plays it inline; the Download button streams it through our same-origin proxy so the browser saves the file instead of opening the URL.
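For a rough picture of what a stitch step can look like with FFmpeg, the sketch below builds a concat-demuxer list and an argument array. The FFmpeg flags are real, but the `RenderInput` shape, the helper names, and the file paths are hypothetical, text overlays are omitted, and it assumes all clips share resolution and frame rate. The actual render pipeline may be built quite differently.

```typescript
// Hypothetical input shape; paths are placeholders.
interface RenderInput {
  clips: string[];   // approved shot clips, in timeline order
  voiceover: string; // narration audio file
  music?: string;    // optional background track
  output: string;    // for example "final.mp4"
}

// Concat demuxer list: one "file '<path>'" line per clip, with single quotes escaped.
function concatList(clips: string[]): string {
  return clips.map((p) => `file '${p.replace(/'/g, "'\\''")}'`).join("\n");
}

function ffmpegArgs(input: RenderInput, listPath: string): string[] {
  const args = ["-y", "-f", "concat", "-safe", "0", "-i", listPath, "-i", input.voiceover];
  if (input.music) {
    // Mix narration (input 1) with music (input 2); the mix lasts as long as the narration.
    args.push(
      "-i", input.music,
      "-filter_complex", "[1:a][2:a]amix=inputs=2:duration=first[mix]",
      "-map", "0:v:0", "-map", "[mix]"
    );
  } else {
    args.push("-map", "0:v:0", "-map", "1:a:0");
  }
  args.push("-c:v", "libx264", "-pix_fmt", "yuv420p", "-c:a", "aac", input.output);
  return args;
}
```

The list text would be written to `listPath` and the array passed to an `ffmpeg` child process; both steps are left out so the sketch stays free of file-system and process dependencies.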

Publish. Once rendered, the ship-to-YouTube modal handles the upload. It collects the channel, title, description, tags, thumbnail, privacy, and audience. The description and the tags can each be drafted with ✦ Suggest from script. Chapter timestamps are appended to the description server-side in YouTube's M:SS Title format, so the block renders as clickable links.
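The chapter block is simple enough to sketch. The `Chapter` shape and both function names below are hypothetical; the format itself is one "M:SS Title" line per chapter. Note that YouTube only recognizes chapters when the list starts at 0:00 and is in ascending order, so a real implementation would likely enforce that before appending.

```typescript
interface Chapter { startSeconds: number; title: string } // hypothetical shape

// Formats seconds as M:SS, switching to H:MM:SS once the video passes an hour.
function formatTimestamp(totalSeconds: number): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  const s = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${minutes}:${pad(seconds)}`;
}

// Appends the chapter block after a blank line; chapters are sorted by start time first.
function appendChapters(description: string, chapters: Chapter[]): string {
  if (chapters.length === 0) return description;
  const lines = [...chapters]
    .sort((a, b) => a.startSeconds - b.startSeconds)
    .map((c) => `${formatTimestamp(c.startSeconds)} ${c.title}`);
  return `${description.trimEnd()}\n\n${lines.join("\n")}`;
}
```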

That's the production loop. Pick a plan, generate, review, render, ship.

© 2026 VidFlow. All rights reserved.