nvidia/
$0.05 in / $0.20 out per 1M tokens
NVIDIA Nemotron 3 Nano is a small, open reasoning model optimized for fast, cost-efficient inference in agentic and production workloads. Built on a hybrid Mixture-of-Experts (MoE) and Mamba-Transformer architecture, it offers strong multi-step reasoning, high token throughput, stable latency with predictable cost, and efficient deployment for agent-based systems. It is designed for real-world AI systems in which reasoning can generate significantly more tokens per prompt, and it reduces compute cost in that setting while maintaining strong reasoning quality.
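Because reasoning can generate many more output tokens than the prompt contains, the output price tends to dominate the cost of a request. The sketch below estimates per-request cost from the prices listed on this page ($0.05 in, $0.20 out, per 1M tokens). The function name, the example token counts, and the assumption that reasoning tokens are billed as output tokens are illustrative and are not stated on this page.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.20


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes reasoning tokens are counted as output tokens,
    which this page does not confirm.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 8,000 generated tokens.
cost = estimate_cost(2_000, 8_000)
print(f"Estimated cost: ${cost:.6f}")
```

With these example numbers, the output tokens account for most of the estimate, which is why a long reasoning trace matters more to the bill than a long prompt.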

© 2026 Deep Infra. All rights reserved.