

Mistral Model Family

Developed by Mistral AI, a leading French research lab, Mistral is a family of open-source AI models built for multilingual excellence, advanced reasoning, and cost-effective performance. These models excel at complex reasoning, mathematics, coding, and specialized tasks while offering complete transparency and deployment freedom through open-source licensing.

Mistral Small 3.2 delivers breakthrough efficiency with native fluency in European languages, while specialized variants handle specific needs: Devstral for coding, Voxtral for audio processing, and Mixtral for high-performance tasks. With Apache 2.0 licensing, extensive context windows up to 128K tokens, and comprehensive customization options, Mistral provides enterprise-grade capabilities without vendor lock-in.

Perfect for building multilingual applications, coding assistants, and reasoning systems where you need both powerful performance and complete control over your AI deployment.

Featured Model: mistralai/Mistral-Small-3.2-24B-Instruct-2506

Mistral-Small-3.2-24B-Instruct is a drop-in upgrade over the 3.1 release, with markedly better instruction following, roughly half the infinite-generation errors, and a more robust function-calling interface—while otherwise matching or slightly improving on all previous text and vision benchmarks.

Price per 1M input tokens

$0.05


Price per 1M output tokens

$0.10


Release Date

06/23/2025


Context Size

128,000


Quantization

fp8
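At these rates, the cost of a single request can be estimated directly from the token counts the API returns. A minimal sketch (the helper name is illustrative; the prices simply restate the figures listed above):

```python
# Listed prices for Mistral-Small-3.2-24B-Instruct-2506 (USD per 1M tokens)
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.10

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from its usage counts."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# The sample request below uses 11 prompt and 25 completion tokens:
print(f"{estimate_cost(11, 25):.8f}")  # → 0.00000305
```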


# Assume openai>=1.0.0
from openai import OpenAI

# Create a client pointed at the DeepInfra endpoint,
# authenticated with your DeepInfra API token
client = OpenAI(
    api_key="$DEEPINFRA_TOKEN",
    base_url="https://api.deepinfra.com/v1/openai",
)

chat_completion = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    messages=[{"role": "user", "content": "Hello"}],
)

print(chat_completion.choices[0].message.content)
print(chat_completion.usage.prompt_tokens, chat_completion.usage.completion_tokens)

# Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?
# 11 25

Available Mistral Models


Available Voxtral Models

Voxtral is a family of audio models with state-of-the-art speech-to-text capabilities.

Model                      $ per minute of audio input
Voxtral-Small-24B-2507     $0.00300
Voxtral-Mini-3B-2507       $0.00100
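Because Voxtral is billed per minute of audio input, a job's cost is simply its duration times the per-minute rate. A small sketch using the rates from the table above (the helper name is illustrative):

```python
# Per-minute audio input rates from the table above (USD)
VOXTRAL_RATES = {
    "Voxtral-Small-24B-2507": 0.00300,
    "Voxtral-Mini-3B-2507": 0.00100,
}

def audio_cost(model: str, minutes: float) -> float:
    """Estimate the USD cost of processing `minutes` of audio."""
    return VOXTRAL_RATES[model] * minutes

# A one-hour recording on each model (cost in USD):
print(round(audio_cost("Voxtral-Small-24B-2507", 60), 5))
print(round(audio_cost("Voxtral-Mini-3B-2507", 60), 5))
```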

FAQ

What is Mistral AI?

Mistral AI is a leading French research lab that develops state-of-the-art LLMs with exceptional multilingual capabilities, advanced reasoning, and cost-effective performance. Built on cutting-edge architecture, Mistral models specialize in complex multilingual reasoning tasks, mathematics, code generation, and domain-specific applications while maintaining complete transparency through open-source licensing. Available in multiple variants including efficient small models, specialized coding assistants, audio processing capabilities, and mixture-of-experts architectures, Mistral models are designed for developers and enterprises seeking powerful AI capabilities with deployment flexibility and cost efficiency.

Are the Mistral models on DeepInfra optimized for low latency?

Yes. DeepInfra's infrastructure delivers optimized performance for Mistral models across all variants. Mistral's efficient architecture, especially the Small series, provides excellent performance-to-cost ratios with fast inference times. The platform supports streaming responses for real-time output, and Mistral's design philosophy emphasizes concise, focused responses that reduce token usage and improve response times.
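In practice, streaming with the OpenAI-compatible API means passing stream=True and concatenating the content deltas as chunks arrive. The accumulation step can be sketched as follows; the stub chunks below stand in for the SDK's real streaming objects, and the live call (shown in a comment) is the same chat.completions.create used earlier:

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta content may be None
            parts.append(delta)
    return "".join(parts)

# With a live client this would be:
#   stream = client.chat.completions.create(
#       model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
#       messages=[{"role": "user", "content": "Hello"}],
#       stream=True,
#   )
#   text = collect_stream(stream)

# Stub chunks shaped like the SDK's streaming objects, for illustration:
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ("Hel", "lo", None)
]
print(collect_stream(fake))  # → Hello
```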

What's the difference between Mistral, Mixtral, Devstral, and Voxtral?

Each model family serves a different purpose within the Mistral ecosystem:
  • Mistral (base family): General-purpose language models optimized for multilingual reasoning, mathematics, and everyday tasks. Available in 7B and 24B parameter variants for different performance needs.
  • Mixtral: Mixture-of-experts architecture that activates different model components based on the task, providing specialized performance across various domains while maintaining efficiency.
  • Devstral: Coding-specialized models trained for software engineering tasks including repository analysis, code completion, multi-file editing, and powering development agents.
  • Voxtral: Audio-enabled models that process both text and speech inputs, offering speech-to-text capabilities for multimodal applications requiring voice interaction.
Choose Mistral for general applications, Mixtral for diverse workloads, Devstral for coding projects, and Voxtral for audio processing needs.

How do I integrate Mistral models into my application?

You can integrate Mistral models seamlessly using DeepInfra’s OpenAI-compatible API. Just replace your existing base URL with DeepInfra’s endpoint and use your DeepInfra API key—no infrastructure setup required. DeepInfra also supports integration through libraries like openai, litellm, and other SDKs, making it easy to switch or scale your workloads instantly.

What are the pricing details for using Mistral models on DeepInfra?

Pricing is usage-based:
  • Input Tokens: between $0.02 and $0.08 per million
  • Output Tokens: between $0.04 and $0.28 per million
Prices vary slightly by model. There are no upfront fees, and you only pay for what you use.

How do I get started using Mistral on DeepInfra?

  • Sign in with GitHub at deepinfra.com
  • Get your API key
  • Test models directly from the browser, cURL, or SDKs
  • Review pricing on your usage dashboard
Within minutes, you can deploy apps using Mistral models—without any infrastructure setup.