Qwen3-Max-Thinking state-of-the-art reasoning model at your fingertips!

Databricks Dolly is instruction tuned 12 billion parameter casual language model based on EleutherAI's pythia-12b. It was pretrained on The Pile, GPT-J's pretraining corpus. databricks-dolly-15k open source instruction following dataset was used to tune the model.
To get started, you'll need an API key from DeepInfra.
You can deploy the databricks/dolly-v2-12b model easily through the web dashboard or by using our API. The model will be automatically deployed when you first run an inference request.
You can use it with our REST API. Here's how to call the model using curl:
curl -X POST \
-d '{"prompt": "Who is Elvis Presley?"}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer YOUR_API_KEY" \
'https://api.deepinfra.com/v1/inference/databricks/dolly-v2-12b'
We charge per inference request execution time, $0.0005 per second. Inference runs on Nvidia A100 cards. To see the full documentation of how to call this model, check out the model page on our website.
You can browse all available models on our models page.
If you have any question, just reach out to us on our Discord server.
Pricing 101: Token Math & Cost-Per-Completion Explained<p>LLM pricing can feel opaque until you translate it into a few simple numbers: input tokens, output tokens, and price per million. Every request you send—system prompt, chat history, RAG context, tool-call JSON—counts as input; everything the model writes back counts as output. Once you know those two counts, the cost of a completion is […]</p>
Compare Llama2 vs OpenAI models for FREE.At DeepInfra we host the best open source LLM models. We are always working hard to make
our APIs simple and easy to use.
Today we are excited to announce a very easy way to quickly try our models like
Llama2 70b and
[Mistral 7b](/mistralai/Mistral-7B-Instruc...
Build an OCR-Powered PDF Reader & Summarizer with DeepInfra (Kimi K2)<p>This guide walks you from zero to working: you’ll learn what OCR is (and why PDFs can be tricky), how to turn any PDF—including those with screenshots of tables—into text, and how to let an LLM do the heavy lifting to clean OCR noise, reconstruct tables, and summarize the document. We’ll use DeepInfra’s OpenAI-compatible API […]</p>
© 2026 Deep Infra. All rights reserved.