Qwen3-Max-Thinking state-of-the-art reasoning model at your fingertips!

You can browse available text-to-image models on the models page.
For example, we'll use runwayml/stable-diffusion-v1-5.
curl -X POST \
-d '{"prompt": "A photo of a cube floating in space"}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer YOUR_API_KEY" \
-o cube.jpg \
'https://api.deepinfra.com/v1/inference/runwayml/stable-diffusion-v1-5'
And check out the output in cube.jpg.
You can check all the available settings on the model page or via the API documentation tab.
Deploy Custom LLMs on DeepInfraDid you just finetune your favorite model and are wondering where to run it?
Well, we have you covered. Simple API and predictable pricing.
Put your model on huggingface
Use a private repo, if you wish, we don't mind. Create a hf access token just
for the repo for better security.
Create c...
GLM-4.6 vs DeepSeek-V3.2: Performance, Benchmarks & DeepInfra Results<p>The open-source LLM ecosystem has evolved rapidly, and two models stand out as leaders in capability, efficiency, and practical usability: GLM-4.6, Zhipu AI’s high-capacity reasoning model with a 200k-token context window, and DeepSeek-V3.2, a sparsely activated Mixture-of-Experts architecture engineered for exceptional performance per dollar. Both models are powerful. Both are versatile. Both are widely adopted […]</p>
From Precision to Quantization: A Practical Guide to Faster, Cheaper LLMs<p>Large language models live and die by numbers—literally trillions of them. How finely we store those numbers (their precision) determines how much memory a model needs, how fast it runs, and sometimes how good its answers are. This article walks from the basics to the deep end: we’ll start with how computers even store a […]</p>
© 2026 Deep Infra. All rights reserved.