Developed by Mistral AI, a leading French research lab, Mistral is a family of open-source AI models built for multilingual excellence, advanced reasoning, and cost-effective performance. These models excel at complex reasoning, mathematics, coding, and specialized tasks while offering complete transparency and deployment freedom through open-source licensing.
Mistral Small 3.2 delivers breakthrough efficiency with native fluency in European languages, while specialized variants handle specific needs: Devstral for coding, Voxtral for audio processing, and Mixtral for high-performance tasks. With Apache 2.0 licensing, extensive context windows up to 128K tokens, and comprehensive customization options, Mistral provides enterprise-grade capabilities without vendor lock-in.
Perfect for building multilingual applications, coding assistants, and reasoning systems where you need both powerful performance and complete control over your AI deployment.
Mistral Small 3.2 delivers exceptional performance for its size — powerful multilingual reasoning, advanced coding capabilities, and efficient inference at a competitive price point, making it ideal for production applications that need reliable performance without premium costs.
Price per 1M input tokens: $0.075
Price per 1M output tokens: $0.20
Release Date: 06/23/2025
Context Size: 128,000 tokens
Quantization: fp8
```python
# Assume openai>=1.0.0
from openai import OpenAI

# Create an OpenAI client with your DeepInfra token and endpoint
openai = OpenAI(
    api_key="$DEEPINFRA_TOKEN",
    base_url="https://api.deepinfra.com/v1/openai",
)

chat_completion = openai.chat.completions.create(
    model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    messages=[{"role": "user", "content": "Hello"}],
)

print(chat_completion.choices[0].message.content)
print(chat_completion.usage.prompt_tokens, chat_completion.usage.completion_tokens)

# Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?
# 11 25
```
DeepInfra provides access to Mistral AI's comprehensive open-source model ecosystem, from efficient small models to specialized coding and audio processing variants, all with complete Apache 2.0 licensing freedom.
Model | Context | $ per 1M input tokens | $ per 1M output tokens
---|---|---|---
Mistral-Small-3.2-24B-Instruct-2506 | 125k | $0.075 | $0.20
Mistral-Small-3.1-24B-Instruct-2503 | 125k | $0.05 | $0.10
Mistral-Small-24B-Instruct-2501 | 32k | $0.05 | $0.08
Mistral-7B-Instruct-v0.3 | 32k | $0.028 | $0.054
Mistral-Nemo-Instruct-2407 | 128k | $0.02 | $0.04
Mixtral-8x7B-Instruct-v0.1 | 32k | $0.40 | $0.40
Devstral-Small-2507 | 125k | $0.07 | $0.28
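Because billing is per token, estimating a request's cost is simple arithmetic. A minimal sketch using the Mistral-Small-3.2-24B-Instruct-2506 rates from the table above:

```python
# Cost estimation at the listed rates for Mistral-Small-3.2-24B-Instruct-2506.
PRICE_IN_PER_M = 0.075   # $ per 1M input (prompt) tokens
PRICE_OUT_PER_M = 0.20   # $ per 1M output (completion) tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of a single chat completion at the listed rates."""
    return (prompt_tokens * PRICE_IN_PER_M
            + completion_tokens * PRICE_OUT_PER_M) / 1_000_000

# e.g. the 11 prompt / 25 completion tokens from the chat example above
cost = request_cost(11, 25)
```

At these rates the "Hello" exchange above costs a small fraction of a cent, and a full 1M-input / 1M-output workload comes to $0.275.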
Voxtral is a family of audio models with state-of-the-art speech-to-text capabilities.
Model | $ per minute of audio input
---|---
Voxtral-Small-24B-2507 | $0.00300
Voxtral-Mini-3B-2507 | $0.00100
Mistral AI is a leading French research lab that develops state-of-the-art LLMs with exceptional multilingual capabilities, advanced reasoning, and cost-effective performance.
Built on cutting-edge architecture, Mistral models specialize in complex multilingual reasoning tasks, mathematics, code generation, and domain-specific applications while maintaining complete transparency through open-source licensing. Available in multiple variants including efficient small models, specialized coding assistants, audio processing capabilities, and mixture-of-experts architectures, Mistral models are designed for developers and enterprises seeking powerful AI capabilities with deployment flexibility and cost efficiency.
DeepInfra's infrastructure delivers optimized performance for Mistral models across all variants. Mistral's efficient architecture, especially in the Small series, provides excellent performance-to-cost ratios with fast inference times. The platform supports streaming responses for real-time output, and Mistral's design philosophy of concise, focused responses reduces token usage and improves response times.
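The streaming support mentioned above works through the same OpenAI-compatible client as the chat example. A minimal sketch, assuming your token is exported as the DEEPINFRA_TOKEN environment variable:

```python
import os

# Request payload for a streamed completion; stream=True makes the
# server send response deltas as they are generated.
REQUEST = {
    "model": "mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,
}

def stream_chat(request: dict) -> None:
    """Print streamed tokens as they arrive (requires network access)."""
    from openai import OpenAI  # openai>=1.0.0, as in the example above
    client = OpenAI(
        api_key=os.environ["DEEPINFRA_TOKEN"],  # assumes the token is exported
        base_url="https://api.deepinfra.com/v1/openai",
    )
    for chunk in client.chat.completions.create(**request):
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
```

Calling `stream_chat(REQUEST)` prints the reply incrementally instead of waiting for the full completion.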
Each model family serves different purposes within the Mistral ecosystem:
Mistral (base family): General-purpose language models optimized for multilingual reasoning, mathematics, and everyday tasks. Available in 7B and 24B parameter variants for different performance needs.
Mixtral: Mixture-of-experts architecture that activates different model components based on the task, providing specialized performance across various domains while maintaining efficiency.
Devstral: Coding-specialized models trained specifically for software engineering tasks including repository analysis, code completion, multi-file editing, and powering development agents.
Voxtral: Audio-enabled models that process both text and speech inputs, offering speech-to-text capabilities for multimodal applications requiring voice interaction.
Choose Mistral for general applications, Mixtral for diverse workloads, Devstral for coding projects, and Voxtral for audio processing needs.
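The guidance above can be reduced to a simple lookup from use case to model name. The `mistralai/` prefix follows the ID convention in the chat example; treat the exact strings as assumptions to verify against each model's page:

```python
# Map use cases to model IDs from the tables above.
# The "mistralai/" prefix is assumed from the chat example's ID format.
MODEL_FOR_TASK = {
    "general": "mistralai/Mistral-Small-3.2-24B-Instruct-2506",  # Mistral base family
    "diverse-workloads": "mistralai/Mixtral-8x7B-Instruct-v0.1",  # Mixtral MoE
    "coding": "mistralai/Devstral-Small-2507",                    # Devstral
    "audio": "mistralai/Voxtral-Small-24B-2507",                  # Voxtral
}

def pick_model(task: str) -> str:
    """Return the model ID for a task, defaulting to the general model."""
    return MODEL_FOR_TASK.get(task, MODEL_FOR_TASK["general"])
```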
Mistral's unique advantages center on:
- Open-source Apache 2.0 licensing with no vendor lock-in
- Native multilingual fluency, especially in European languages
- Cost-effective inference with strong performance-to-cost ratios
- Deployment flexibility and extensive customization options
This combination makes Mistral ideal for organizations prioritizing cost efficiency, customization, and AI sovereignty.
© 2025 Deep Infra. All rights reserved.