Devstral is an agentic LLM for software engineering tasks, making it a great choice for software engineering agents.
Devstral is an agentic LLM for software engineering tasks, making it a great choice for software engineering agents.
Devstral-Small-2507
Ask me anything
Devstral is an agentic LLM for software engineering tasks built under a collaboration between Mistral AI and All Hands AI 🙌. Devstral excels at using tools to explore codebases, editing multiple files and power software engineering agents. The model achieves remarkable performance on SWE-bench which positionates it as the #1 open source model on this benchmark.
It is finetuned from Mistral-Small-3.1, therefore it has a long context window of up to 128k tokens. As a coding agent, Devstral is text-only and before fine-tuning from Mistral-Small-3.1
the vision encoder was removed.
For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
Learn more about Devstral in our blog post.
Updates compared to Devstral Small 1.0
:
Devstral Small 1.1
is still great when paired with OpenHands. This new version also generalizes better to other prompts and coding environments.Devstral Small 1.1 achieves a score of 53.6% on SWE-Bench Verified, outperforming Devstral Small 1.0 by +6,8% and the second best state of the art model by +11.4%.
Model | Agentic Scaffold | SWE-Bench Verified (%) |
---|---|---|
Devstral Small 1.1 | OpenHands Scaffold | 53.6 |
Devstral Small 1.0 | OpenHands Scaffold | 46.8 |
GPT-4.1-mini | OpenAI Scaffold | 23.6 |
Claude 3.5 Haiku | Anthropic Scaffold | 40.6 |
SWE-smith-LM 32B | SWE-agent Scaffold | 40.2 |
Skywork SWE | OpenHands Scaffold | 38.0 |
DeepSWE | R2E-Gym Scaffold | 42.2 |
When evaluated under the same test scaffold (OpenHands, provided by All Hands AI 🙌), Devstral exceeds far larger models such as Deepseek-V3-0324 and Qwen3 232B-A22B.