Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence
NVIDIA: Amala Sanjay Deshmukh, Kateryna Chumachenko, Tuomas Rintamaki, Matthieu Le, Tyler Poon, Danial Mohseni Taheri, Ilia Karmanov, Guilin Liu, Jarno Seppanen, Arushi Goel, Mike Ranzinger, Greg Heinrich, Guo Chen, Lukas Voegtle, Philipp Fischer, Timo Roman, Karan Sapra

TL;DR
Nemotron 3 Nano Omni is a multimodal AI model that accepts audio, text, images, and video. It improves accuracy over its predecessor, Nemotron Nano V2 VL, lowers inference latency, and ships with open checkpoints plus portions of the training data and codebase.
Contribution
It introduces a new multimodal model with native audio support, enhanced accuracy, and innovative token-reduction techniques for lower latency and higher throughput.
Findings
Achieves leading results in document understanding, audio-video comprehension, and agentic computer use.
Delivers lower inference latency and higher throughput than comparable models.
Releases model checkpoints in BF16, FP8, and FP4 formats, along with portions of the training data and codebase.
Abstract
We introduce Nemotron 3 Nano Omni, the latest model in the Nemotron multimodal series and the first to natively support audio inputs alongside text, images, and video. Nemotron 3 Nano Omni delivers consistent accuracy improvements over its predecessor, Nemotron Nano V2 VL, across all modalities, enabled by advances in architecture, training data and recipes. In particular, Nemotron 3 Nano Omni delivers leading results in real-world document understanding, long audio-video comprehension, and agentic computer use. Built on the highly efficient Nemotron 3 Nano 30B-A3B backbone, Nemotron 3 Nano Omni further incorporates innovative multimodal token-reduction techniques to deliver substantially lower inference latency and higher throughput than other models of similar size. We are releasing model checkpoints in BF16, FP8, and FP4 formats, along with portions of the training data and codebase to…
Code & Models
- 🤗 nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · model · 391k dl · ♡ 310
- 🤗 nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4 · model · 1.3M dl · ♡ 116
- 🤗 nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-FP8 · model · 170k dl · ♡ 48
- 🤗 touchmboweni/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · model · 22 dl
- 🤗 god-yhw/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4 · model · 20 dl · ♡ 1
- 🤗 servantofares/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · model · 22 dl
- 🤗 Jashan887/76_Nvidia_Reasoning_30B · model · 16 dl
- 🤗 Ihckhfuffhlays/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · model · 18 dl
- 🤗 Kris071/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · model · 17 dl
- 🤗 nvidia-ai/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · model · 163k dl
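As a rough guide to choosing among the BF16, FP8, and NVFP4 checkpoints listed above, the weight-only storage can be estimated from the nominal bits per parameter. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes about 30 billion total parameters (read from the "30B" in the model name) and ignores quantization scale factors, any layers kept in higher precision, activations, and the KV cache. The function name `estimate_weight_gb` is hypothetical.

```python
# Nominal bits per weight for each released checkpoint format.
BITS_PER_PARAM = {"BF16": 16, "FP8": 8, "FP4": 4}


def estimate_weight_gb(num_params: float, fmt: str) -> float:
    """Approximate weight-only storage in gigabytes (1 GB = 1e9 bytes)."""
    bits = BITS_PER_PARAM[fmt]
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    # Assumption: "30B" in the model name means roughly 30 billion total parameters.
    total_params = 30e9
    for fmt in BITS_PER_PARAM:
        print(f"{fmt}: ~{estimate_weight_gb(total_params, fmt):.0f} GB of weights")
```

Under these assumptions the three formats come out near 60 GB, 30 GB, and 15 GB. Actual checkpoint sizes on disk will differ somewhat.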
Videos
NVIDIA New AI Is An Efficiency Monster · YouTube
