Deploy Custom LLMs on DeepInfra
Published on 2024.03.01 by Iskren Chernev

Just fine-tuned your favorite model and wondering where to run it? We have you covered, with a simple API and predictable pricing.

Put your model on Hugging Face

Use a private repo if you wish; we don't mind. For better security, create a Hugging Face access token scoped to just that repo.

Create a custom deployment

Via Web

You can use the Web UI to create a new deployment.

Custom LLM Web UI

Via HTTP

We also offer an HTTP API:

curl -X POST https://api.deepinfra.com/deploy/llm -d '{
    "model_name": "test-model",
    "gpu": "A100-80GB",
    "num_gpus": 2,
    "max_batch_size": 64,
    "hf": {
        "repo": "meta-llama/Llama-2-7b-chat-hf"
    },
    "settings": {
        "min_instances": 1,
        "max_instances": 1
    }
}' -H 'Content-Type: application/json' \
    -H "Authorization: Bearer YOUR_API_KEY"
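
If you create deployments often, the request above can be wrapped in a couple of small shell functions. This is only a sketch: the function names and the `DEEPINFRA_API_KEY` environment variable are assumptions made for illustration, and the payload fields are copied from the example above rather than from a full API reference.

```shell
# Hypothetical helper: print the deploy payload as JSON.
# $1 = model name, $2 = Hugging Face repo, $3 = GPU type, $4 = number of GPUs
# max_batch_size and the instance settings are fixed to the values used above.
# Arguments are inserted verbatim, so they must not contain quotes or backslashes.
build_deploy_payload() {
    printf '{"model_name": "%s", "gpu": "%s", "num_gpus": %d, "max_batch_size": 64, "hf": {"repo": "%s"}, "settings": {"min_instances": 1, "max_instances": 1}}\n' \
        "$1" "$3" "$4" "$2"
}

# Hypothetical helper: POST the payload to the deploy endpoint.
# Reads the API key from DEEPINFRA_API_KEY (an assumed variable name)
# and aborts with a message if it is unset.
deploy_llm() {
    curl -sS -X POST https://api.deepinfra.com/deploy/llm \
        -H 'Content-Type: application/json' \
        -H "Authorization: Bearer ${DEEPINFRA_API_KEY:?set DEEPINFRA_API_KEY first}" \
        -d "$(build_deploy_payload "$@")"
}
```

With these defined, `deploy_llm test-model meta-llama/Llama-2-7b-chat-hf A100-80GB 2` would send the same request as the curl command above. Keeping payload construction separate from the network call makes it easy to inspect the JSON before sending it.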

Use it

curl -X POST \
    -d '{"input": "Hello"}' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer YOUR_API_KEY" \
    'https://api.deepinfra.com/v1/inference/github-username/di-model-name'
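
When the prompt comes from a variable, it needs escaping before it is embedded in the JSON body. The sketch below shows one way to do that in plain shell. The helper names and the `DEEPINFRA_API_KEY` variable are assumptions, and the escaping covers only double quotes and backslashes, not newlines or other control characters.

```shell
# Hypothetical helper: escape backslashes and double quotes for use
# inside a JSON string. Does not handle newlines or control characters.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: build the request body used in the example above.
build_inference_body() {
    printf '{"input": "%s"}\n' "$(json_escape "$1")"
}

# Hypothetical helper: call the inference endpoint.
# $1 = model path as it appears after /v1/inference/ in the URL above
# $2 = prompt text
infer() {
    curl -sS -X POST \
        -d "$(build_inference_body "$2")" \
        -H 'Content-Type: application/json' \
        -H "Authorization: Bearer ${DEEPINFRA_API_KEY:?set DEEPINFRA_API_KEY first}" \
        "https://api.deepinfra.com/v1/inference/$1"
}
```

For example, `infer github-username/di-model-name 'Hello'` would issue the same request as the curl command above. For prompts with arbitrary content, a real JSON encoder is a safer choice than this minimal escaping.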

For an in-depth tutorial, check the Custom LLM Docs.
