
To use DeepInfra's API, you'll need an API key.
You'll include this key in each request to authenticate with our services.
Whisper is a speech-to-text model from OpenAI. Given an audio file containing speech, it produces a text transcript with per-sentence timestamps. It comes in several model sizes (small, base, large, etc.) and in English-only variants; see deepinfra.com for more. By default, Whisper segments its timestamps by sentence. We also host whisper-timestamped, which can provide timestamps for individual words in the audio. You can use it with our REST API. Here's how to use it:
curl -X POST \
-F "audio=@/home/user/all-in-01.mp3" \
-H "Authorization: Bearer YOUR_API_KEY" \
'https://api.deepinfra.com/v1/inference/openai/whisper-timestamped-medium.en'
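The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the endpoint, the `audio` form field, and the Bearer header are taken from the curl command above, while the helper names `build_request` and `transcribe` are hypothetical. The response is returned as raw bytes because its schema is not described here.

```python
import os
import urllib.request
import uuid

# Endpoint and model path copied from the curl command above.
API_BASE = "https://api.deepinfra.com/v1/inference"
MODEL = "openai/whisper-timestamped-medium.en"


def build_request(audio_bytes, filename, api_key, model=MODEL):
    """Build a multipart/form-data POST equivalent to the curl call.

    Hypothetical helper: the name and signature are illustrative only.
    """
    boundary = uuid.uuid4().hex
    # One form part named "audio", matching curl's -F "audio=@...".
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="audio"; '
            f'filename="{filename}"\r\n'
        ).encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(
        f"{API_BASE}/{model}", data=body, headers=headers, method="POST"
    )


def transcribe(path, api_key):
    """Send the audio file and return the raw response body as bytes."""
    with open(path, "rb") as f:
        req = build_request(f.read(), os.path.basename(path), api_key)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


# Example usage (performs a real network call; YOUR_API_KEY is a placeholder):
# print(transcribe("/home/user/all-in-01.mp3", "YOUR_API_KEY").decode())
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without making a network call.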
To see additional parameters and how to call this model, check out the documentation page for the complete API reference and examples.
If you have any questions, just reach out to us on our Discord server.
© 2026 DeepInfra. All rights reserved.