Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still utilizing a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also enhances the framework’s flexibility. Janus-Pro surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus-Pro make it a strong candidate for next-generation unified multimodal models.
You can use cURL or any other HTTP client to run inference:
curl -X POST \
    -H "Authorization: bearer $DEEPINFRA_TOKEN" \
    -F image=@my_image.jpg \
    -F 'question=Explain this image.' \
    'https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B'
which will give you back something similar to:
{
  "response": "A photo of an astronaut riding a horse on Mars.",
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}
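For readers who prefer a scripted client, the same request can be sketched in Python. This is a minimal sketch, not an official client: it assumes the third-party requests package is installed, and the helper names ask_about_image and extract_answer are hypothetical. The URL, header, and form fields mirror the cURL call above.

```python
API_URL = "https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B"


def extract_answer(payload):
    """Return the model's text answer from a decoded JSON response body."""
    if "response" not in payload:
        raise KeyError("no 'response' field in payload: %r" % (payload,))
    return payload["response"]


def ask_about_image(image_path, question, token):
    """POST an image and a question as multipart form data, like the cURL call."""
    # Assumption: the third-party 'requests' package is available.
    # It is imported here so extract_answer can be used without it.
    import requests

    with open(image_path, "rb") as image_file:
        reply = requests.post(
            API_URL,
            headers={"Authorization": f"bearer {token}"},
            files={"image": image_file},   # same role as -F image=@my_image.jpg
            data={"question": question},   # same role as -F 'question=...'
            timeout=120,
        )
    reply.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return extract_answer(reply.json())
```

Calling ask_about_image("my_image.jpg", "Explain this image.", token), with token read from the DEEPINFRA_TOKEN environment variable, should return the text found in the "response" field of the JSON shown above.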
top_p
Type: number
Top-p sampling parameter; higher values increase diversity.
Default value: 0.95
Range: 0 ≤ top_p ≤ 1
temperature
Type: number
Temperature parameter; higher values increase randomness.
Default value: 0.1
Range: 0 ≤ temperature ≤ 1
webhook
Type: file
The webhook to call when inference is done. By default, you will get the output in the response of your inference request.
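The documented defaults and ranges for top_p and temperature can be checked on the client side before a request is sent. The sketch below is illustrative only: the helper name sampling_fields is hypothetical, and it assumes these parameters are passed as extra form fields next to image and question, which the text above does not state explicitly.

```python
def sampling_fields(top_p=0.95, temperature=0.1):
    """Validate the documented ranges and return the values as form fields."""
    for name, value in (("top_p", top_p), ("temperature", temperature)):
        # Both parameters are documented with the range 0 <= value <= 1.
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Form fields are sent as text, so convert the numbers to strings.
    return {"top_p": str(top_p), "temperature": str(temperature)}
```

The returned dictionary can be merged into the form data of whichever HTTP client issues the request, for example by adding one -F name=value pair per entry to a cURL command (again assuming the endpoint accepts them as form fields).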