Volcengine

Bytedance/seedance-2-0

From $3.87 / 1M tokens

Seedance 2.0 is a multimodal controllable video generation model developed by ByteDance’s Seed Team. Launched in early February 2026, it is now integrated into Doubao, Jimeng AI, and Volcano Engine (Model ID: doubao-seedance-2-0-260128), with an accelerated version—Seedance 2.0 Fast—available for low-latency scenarios.

Text to Video, Image to Video

README

What's coming next: Seedance 2.5 launches in early July 2026.

At the core of Seedance 2.0 is a unified multimodal architecture for joint audio-visual generation that natively supports four input modalities: text, images, audio, and video. A single generation accepts up to 3 video clips, 9 reference images, and 3 audio segments, and outputs a 4–15 second, 480p/720p cinematic multi-shot video with native audio, so no post-production is required.

Key Capabilities

Multi-Shot Storytelling: A single prompt can generate a coherent sequence of multiple shots, automatically maintaining character, environment, and narrative consistency for a polished, edited look.

Director-Level Camera Control: Executes cinematic movements such as dolly zooms (also known as Hitchcock zooms) and tracking shots, as well as first-person perspectives and slow-motion close-ups.

Real-World Physics: Models physical behavior to generate realistic collisions, deformations, and dynamics for action sequences, chases, and explosions.

Native Audio-Video Sync: Jointly generates audio and video with precise lip-sync for 8+ languages, delivering music, dialogue, and ambient sound in one pass, with no separate audio synthesis step.

Efficient Generation Architecture: Built on the Flow Matching framework (instead of traditional diffusion models), which boosts generation speed by 30% while maintaining high quality. On the internal benchmark SeedVideoBench-2.0, the model leads in motion realism, audio quality, and other metrics.

Use Cases

Suited to short drama production, e-commerce ads, AI comics, pre-shooting, and game animation, where it reduces cost, lowers the barrier to entry, and improves production efficiency.
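The input limits quoted in the description (up to 3 video clips, 9 reference images, 3 audio segments, 4–15 seconds, 480p/720p) can be checked on the client before a job is submitted. The following is a minimal sketch under stated assumptions: `GenerationRequest`, its field names, and `validate` are hypothetical and do not reflect the actual Volcano Engine request schema. The pricing table also lists 1080P, so the allowed-resolution set may need adjusting.

```python
from dataclasses import dataclass, field

# Limits taken from the model description; all names here are illustrative.
MAX_VIDEO_CLIPS = 3
MAX_IMAGES = 9
MAX_AUDIO_SEGMENTS = 3
MIN_SECONDS, MAX_SECONDS = 4, 15
# The description says 480p/720p; the pricing table also lists 1080P.
RESOLUTIONS = {"480p", "720p"}


@dataclass
class GenerationRequest:
    """Hypothetical request shape, not the real API schema."""
    prompt: str
    duration_seconds: int
    resolution: str
    video_refs: list[str] = field(default_factory=list)
    image_refs: list[str] = field(default_factory=list)
    audio_refs: list[str] = field(default_factory=list)


def validate(req: GenerationRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if len(req.video_refs) > MAX_VIDEO_CLIPS:
        problems.append(f"too many video clips: {len(req.video_refs)} > {MAX_VIDEO_CLIPS}")
    if len(req.image_refs) > MAX_IMAGES:
        problems.append(f"too many reference images: {len(req.image_refs)} > {MAX_IMAGES}")
    if len(req.audio_refs) > MAX_AUDIO_SEGMENTS:
        problems.append(f"too many audio segments: {len(req.audio_refs)} > {MAX_AUDIO_SEGMENTS}")
    if not MIN_SECONDS <= req.duration_seconds <= MAX_SECONDS:
        problems.append(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} s")
    if req.resolution.lower() not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.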

Pricing

| Resolution | videos | Token Type | LinkAI Price (per 1M tokens) | Official Price (per 1M tokens) |
| --- | --- | --- | --- | --- |
| 1080P | false | output | $6.93 | $7.70 |
| 1080P | true | output | $4.23 | $4.70 |
| 480P | false | output | $6.30 | $7.00 |
| 480P | true | output | $3.87 | $4.30 |
| 720P | false | output | $6.30 | $7.00 |
| 720P | true | output | $3.87 | $4.30 |
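
Because prices are quoted per 1M output tokens, the cost of a job scales linearly with the number of tokens billed. The sketch below copies the rates from the pricing table and assumes the token count comes from whatever usage figure the provider reports. How many tokens a given clip consumes is not stated here, and `estimate_cost` is a hypothetical helper name.

```python
# USD per 1M output tokens, copied from the pricing table.
# Key: (resolution, videos flag) -> (LinkAI price, official price)
PRICES = {
    ("1080P", False): (6.93, 7.70),
    ("1080P", True): (4.23, 4.70),
    ("480P", False): (6.30, 7.00),
    ("480P", True): (3.87, 4.30),
    ("720P", False): (6.30, 7.00),
    ("720P", True): (3.87, 4.30),
}


def estimate_cost(resolution: str, videos: bool, output_tokens: int,
                  official: bool = False) -> float:
    """Estimate the USD cost of a job from its billed output tokens."""
    linkai_rate, official_rate = PRICES[(resolution.upper(), videos)]
    rate = official_rate if official else linkai_rate
    return output_tokens / 1_000_000 * rate


# Example: 1M output tokens at 720P with the videos flag set to true.
print(estimate_cost("720P", True, 1_000_000))  # 3.87
```

In every row the LinkAI price works out to 90% of the official price (for example, 7.70 × 0.9 = 6.93).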