DeepInfra raises $107M Series B to scale the inference cloud — read the announcement

DeepInfra is serving NVIDIA Cosmos 3, NVIDIA's open world foundation model for physical AI, from day zero of its release. As the first omnimodel for physical AI that reasons before it generates, Cosmos 3 is live on DeepInfra today as two variants—Cosmos 3 Nano and Cosmos 3 Super—at the industry's best prices, empowering developers to build physical AI systems without compromising on budget or performance.
Most generative models just generate. Cosmos 3 does something different: it reasons first, then generates. That distinction matters a great deal if you're building physical AI systems like robots or autonomous vehicles, where generating plausible-but-wrong outputs isn't just a quality issue—it's a safety one. As NVIDIA describes it, Cosmos 3 is the first OmniModel that unifies reasoning, world, and action generation in a single architecture.
Under the hood it uses a Mixture-of-Transformer architecture that combines an autoregressive reasoner with a diffusion-based generator. Inputs and outputs span text, image, video, audio, and action, making Cosmos 3 genuinely multimodal in both directions—not just for perception, but for generation and decision-making as well.
Ranked #1 open world generation model for synthetic data generation. Use it to generate training data for physical AI at scale, without expensive real-world data collection.
Ranked #1 backbone for world action models. A strong foundation for robotics, embodied AI, and AV policy training.
Ranked #1 open model for visual understanding on fixed infrastructure cameras—useful for smart city, warehouse, logistics deployments, infrastructure monitoring, and industrial automation.
Designed for closed-loop learning and simulation workflows. Pairs with NVIDIA AV Sim and Isaac Sim for training, testing, and evaluating physical AI systems in simulated environments before deployment.
The lighter variant. A good starting point for experimentation, fine-tuning, and latency-sensitive workloads.
The full-capability variant. Tops the PAI Bench and R-Bench leaderboards. Use it where quality and reasoning performance are the priority.
Both are available on DeepInfra today via our standard API—the same setup as any other model, with no special configuration needed to get started.
Cosmos 3 Nano and Cosmos 3 Super are live on DeepInfra now. If you're building physical AI, robots, or AV systems and want to experiment with world modeling, reasoning, action generation, and synthetic data creation, this is a strong place to start.
Visit our models page to explore competitive rates for Cosmos 3 inference, or check out the DeepInfra docs to learn more about our complete model ecosystem and developer resources.
Enhancing Open-Source LLMs with Function Calling FeatureWe're excited to announce that the Function Calling feature is now available on DeepInfra. We're offering Mistral-7B and Mixtral-8x7B models with this feature. Other models will be available soon.
LLM models are powerful tools for various tasks. However, they're limited in their ability to per...
Building Efficient AI Inference on NVIDIA Blackwell PlatformDeepInfra delivers up to 20x cost reductions on NVIDIA Blackwell by combining MoE architectures, NVFP4 quantization, and inference optimizations — with a Latitude case study.© 2026 DeepInfra. All rights reserved.