BGE-M3 is a versatile text embedding model that supports multi-functionality, multi-linguality, and multi-granularity, allowing it to perform dense retrieval, multi-vector retrieval, and sparse retrieval in over 100 languages and with input sizes up to 8192 tokens. The model can be used in a retrieval pipeline with hybrid retrieval and re-ranking to achieve higher accuracy and stronger generalization capabilities. BGE-M3 has shown state-of-the-art performance on several benchmarks, including MKQA, MLDR, and NarritiveQA, and can be used as a drop-in replacement for other embedding models like DPR and BGE-v1.5.
BGE-M3 is a versatile text embedding model that supports multi-functionality, multi-linguality, and multi-granularity, allowing it to perform dense retrieval, multi-vector retrieval, and sparse retrieval in over 100 languages and with input sizes up to 8192 tokens. The model can be used in a retrieval pipeline with hybrid retrieval and re-ranking to achieve higher accuracy and stronger generalization capabilities. BGE-M3 has shown state-of-the-art performance on several benchmarks, including MKQA, MLDR, and NarritiveQA, and can be used as a drop-in replacement for other embedding models like DPR and BGE-v1.5.
DeepInfra supports the OpenAI embeddings API. The following creates an embedding vector representing the input text
curl "https://api.deepinfra.com/v1/openai/embeddings" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-d '{
"input": "The food was delicious and the waiter...",
"model": "BAAI/bge-m3",
"encoding_format": "float"
}'
which will return something similar to
{
"object":"list",
"data":[
{
"object": "embedding",
"index":0,
"embedding":[
-0.010480394586920738,
-0.0026091758627444506
...
0.031979579478502274,
0.02021978422999382
]
}
],
"model": "BAAI/bge-m3",
"usage": {
"prompt_tokens":12,
"total_tokens":12
}
}
service_tier
stringThe service tier used for processing the request. When set to 'priority', the request will be processed with higher priority.
Allowed values: default
priority
dimensions
integerThe number of dimensions in the embedding. If not provided, the model's default will be used.If provided bigger than model's default, the embedding will be padded with zeros.
Range: 32 ≤ dimensions
Run models at scale with our fully managed GPU infrastructure, delivering enterprise-grade uptime at the industry's best rates.