

Mistral Model Family

Developed by Mistral AI, a leading French research lab, Mistral is a family of open-source AI models built for multilingual excellence, advanced reasoning, and cost-effective performance. These models excel at complex reasoning, mathematics, coding, and specialized tasks while offering complete transparency and deployment freedom through open-source licensing.

Mistral Small 3.2 delivers breakthrough efficiency with native fluency in European languages, while specialized variants handle specific needs: Devstral for coding, Voxtral for audio processing, and Mixtral for high-performance tasks. With Apache 2.0 licensing, extensive context windows up to 128K tokens, and comprehensive customization options, Mistral provides enterprise-grade capabilities without vendor lock-in.

Perfect for building multilingual applications, coding assistants, and reasoning systems where you need both powerful performance and complete control over your AI deployment.

Featured Model: mistralai/Mistral-Small-3.2-24B-Instruct-2506

Mistral-Small-3.2-24B-Instruct is a drop-in upgrade over the 3.1 release, with markedly better instruction following, roughly half the infinite-generation errors, and a more robust function-calling interface—while otherwise matching or slightly improving on all previous text and vision benchmarks.

Price per 1M input tokens

$0.05


Price per 1M output tokens

$0.10


Release Date

06/23/2025


Context Size

128,000


Quantization

fp8
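At these rates, the cost of a single request can be estimated directly from the token counts the API returns. A minimal sketch (the helper name is illustrative; the prices simply restate the figures listed above):

```python
# Listed prices for Mistral-Small-3.2-24B-Instruct-2506 (USD per 1M tokens)
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.10

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from its usage counts."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# The sample request below uses 11 prompt and 25 completion tokens:
print(f"{estimate_cost(11, 25):.8f}")  # → 0.00000305
```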


# Assume openai>=1.0.0
from openai import OpenAI

# Create a client pointed at the DeepInfra endpoint,
# authenticated with your DeepInfra API token
client = OpenAI(
    api_key="$DEEPINFRA_TOKEN",
    base_url="https://api.deepinfra.com/v1/openai",
)

chat_completion = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    messages=[{"role": "user", "content": "Hello"}],
)

print(chat_completion.choices[0].message.content)
print(chat_completion.usage.prompt_tokens, chat_completion.usage.completion_tokens)

# Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?
# 11 25

Available Mistral Models


Available Voxtral Models

Voxtral is a family of audio models with state-of-the-art speech-to-text capabilities.

Model                      $ per minute of audio input
Voxtral-Small-24B-2507     $0.00300
Voxtral-Mini-3B-2507       $0.00100
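Because Voxtral is billed per minute of audio input, a job's cost is simply its duration times the per-minute rate. A small sketch using the rates from the table above (the helper name is illustrative):

```python
# Per-minute audio input rates from the table above (USD)
VOXTRAL_RATES = {
    "Voxtral-Small-24B-2507": 0.00300,
    "Voxtral-Mini-3B-2507": 0.00100,
}

def audio_cost(model: str, minutes: float) -> float:
    """Estimate the USD cost of processing `minutes` of audio."""
    return VOXTRAL_RATES[model] * minutes

# A one-hour recording on each model (cost in USD):
print(round(audio_cost("Voxtral-Small-24B-2507", 60), 5))
print(round(audio_cost("Voxtral-Mini-3B-2507", 60), 5))
```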

FAQ

What is Mistral AI?

Mistral AI is a leading French research lab that develops state-of-the-art LLMs with exceptional multilingual capabilities, advanced reasoning, and cost-effective performance. Built on cutting-edge architecture, Mistral models specialize in complex multilingual reasoning tasks, mathematics, code generation, and domain-specific applications while maintaining complete transparency through open-source licensing. Available in multiple variants including efficient small models, specialized coding assistants, audio processing capabilities, and mixture-of-experts architectures, Mistral models are designed for developers and enterprises seeking powerful AI capabilities with deployment flexibility and cost efficiency.

Are the Mistral models on DeepInfra optimized for low latency?

Yes. DeepInfra's infrastructure delivers optimized performance for Mistral models across all variants. Mistral's efficient architecture, especially the Small series, provides excellent performance-to-cost ratios with fast inference times. The platform supports streaming responses for real-time output, and Mistral's design philosophy emphasizes concise, focused responses that reduce token usage and improve response times.
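In practice, streaming with the OpenAI-compatible API means passing stream=True and concatenating the content deltas as chunks arrive. The accumulation step can be sketched as follows; the stub chunks below stand in for the SDK's real streaming objects, and the live call (shown in a comment) is the same chat.completions.create used earlier:

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta content may be None
            parts.append(delta)
    return "".join(parts)

# With a live client this would be:
#   stream = client.chat.completions.create(
#       model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
#       messages=[{"role": "user", "content": "Hello"}],
#       stream=True,
#   )
#   text = collect_stream(stream)

# Stub chunks shaped like the SDK's streaming objects, for illustration:
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ("Hel", "lo", None)
]
print(collect_stream(fake))  # → Hello
```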

What's the difference between Mistral, Mixtral, Devstral, and Voxtral?

Each model family serves a different purpose within the Mistral ecosystem:
  • Mistral (base family): General-purpose language models optimized for multilingual reasoning, mathematics, and everyday tasks. Available in 7B and 24B parameter variants for different performance needs.
  • Mixtral: Mixture-of-experts architecture that activates different model components based on the task, providing specialized performance across various domains while maintaining efficiency.
  • Devstral: Coding-specialized models trained for software engineering tasks including repository analysis, code completion, multi-file editing, and powering development agents.
  • Voxtral: Audio-enabled models that process both text and speech inputs, offering speech-to-text capabilities for multimodal applications requiring voice interaction.
Choose Mistral for general applications, Mixtral for diverse workloads, Devstral for coding projects, and Voxtral for audio processing needs.

How do I integrate Mistral models into my application?

You can integrate Mistral models seamlessly using DeepInfra’s OpenAI-compatible API. Just replace your existing base URL with DeepInfra’s endpoint and use your DeepInfra API key—no infrastructure setup required. DeepInfra also supports integration through libraries like openai, litellm, and other SDKs, making it easy to switch or scale your workloads instantly.

What are the pricing details for using Mistral models on DeepInfra?

Pricing is usage-based:
  • Input Tokens: between $0.02 and $0.08 per million
  • Output Tokens: between $0.04 and $0.28 per million
Prices vary slightly by model. There are no upfront fees, and you only pay for what you use.

How do I get started using Mistral on DeepInfra?

  • Sign in with GitHub at deepinfra.com
  • Get your API key
  • Test models directly from the browser, cURL, or SDKs
  • Review pricing on your usage dashboard
Within minutes, you can deploy apps using Mistral models—without any infrastructure setup.