All categories and models you can try out and use directly on Deep Infra:
featured
text-generation
automatic-speech-recognition
text-to-speech
embeddings
text-to-video
text-to-image
reranker
zero-shot-image-classification
multimodal
ByteDance/
SeeDance-T2V
Seedance 1.0 by ByteDance is a high-performance AI video foundation model that generates 1080p multi-shot clips from both text and image prompts, delivering cinematic motion, structural consistency across scenes, and precise adherence to your instructions.
$1.200 / Mtoken
Wan-AI/
Wan2.1-T2V-1.3B
The Wan2.1 1.3B model is a lightweight, efficient text-to-video generator. Despite its compact size, it delivers impressive performance across benchmarks and generates high-quality 480P videos.
$0.10 / video
Wan2.1-T2V-14B
The Wan2.1 14B model is a high-capacity, state-of-the-art video foundation model capable of producing both 480P and 720P videos. It excels at capturing complex prompts and generating visually rich, detailed scenes, making it ideal for high-end creative tasks.
$0.40 / video
google/
veo-3.0
Veo 3 is a state-of-the-art text-to-video model from Google that generates high-fidelity, cinematic videos with synchronized audio from a simple text prompt. It excels at creating realistic and imaginative scenes with a deep understanding of natural language and visual dynamics.
$0.8 / sec
veo-3.0-fast
Veo 3 Fast is a speed-optimized version of the Veo 3 model, designed for rapid video creation. While maintaining high quality, it delivers results in a fraction of the time, making it ideal for quick iterations and dynamic content generation.
$0.5 / sec
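The listing mixes two billing units, a flat price per video and a price per second of output. As a rough sketch of how the listed prices turn into a budget estimate, the snippet below multiplies them out. The `org/model` identifiers are an assumption made by joining the organization and model names shown above, `estimate_cost` is a hypothetical helper rather than part of any Deep Infra API, and the calculation assumes billed seconds equal the clip length with no rounding or minimum charge. SeeDance-T2V is left out because it is priced per Mtoken and the listing does not say how many tokens a video consumes.

```python
from decimal import Decimal

# Prices copied from the listing above. The "org/model" keys are an
# assumption (organization and model name joined with a slash).
PER_VIDEO_USD = {
    "Wan-AI/Wan2.1-T2V-1.3B": Decimal("0.10"),
    "Wan-AI/Wan2.1-T2V-14B": Decimal("0.40"),
}
PER_SECOND_USD = {
    "google/veo-3.0": Decimal("0.8"),
    "google/veo-3.0-fast": Decimal("0.5"),
}


def estimate_cost(model: str, videos: int = 1, seconds_per_video: int = 0) -> Decimal:
    """Hypothetical helper: estimated cost in USD for a batch of videos.

    Assumes billed seconds equal the clip length, with no rounding
    or minimum charge applied.
    """
    if model in PER_VIDEO_USD:
        # Flat price per generated video; clip length does not matter.
        return PER_VIDEO_USD[model] * videos
    if model in PER_SECOND_USD:
        # Price scales with the total number of seconds generated.
        return PER_SECOND_USD[model] * seconds_per_video * videos
    raise KeyError(f"no listed price for model: {model}")


if __name__ == "__main__":
    # Compare one 8-second clip on each per-second model.
    for name in PER_SECOND_USD:
        print(name, estimate_cost(name, videos=1, seconds_per_video=8))
```

Using `Decimal` rather than floats keeps the currency arithmetic exact, which matters once per-second prices are multiplied across many clips.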
© 2025 Deep Infra. All rights reserved.