Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Zhuolin Yang; Zihan Liu; Yang Chen; Wenliang Dai; Boxin Wang; Sheng-Chieh Lin; Chankyu Lee; Yangyi Chen; Dongfu Jiang; Jiafan He; Renjie Pi; Grace Lam; Nayeon Lee; Alexander Bukharin; Mohammad Shoeybi; Bryan Catanzaro; Wei Ping

arXiv:2603.19220 · cs.CL · March 24, 2026

Open Access · 10 Models

TL;DR

Nemotron-Cascade 2 is a compact open 30B MoE model with 3B activated parameters. Its mathematical and coding reasoning approaches that of frontier open models, and it shows strong agentic capabilities, following SFT, expanded Cascade RL, and multi-domain on-policy distillation.

Contribution

The paper introduces multi-domain on-policy distillation and expands Cascade RL to cover a much broader spectrum of reasoning and agentic domains, improving the reasoning capabilities of a compact LLM.

Findings

01

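Achieves Gold Medal-level performance in the 2025 IMO, the IOI, and the ICPC World Finals with 20x fewer parameters than DeepSeekV3.2-Speciale-671B-A37B, the only open-weight LLM to reach that level before it.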

02

Approaches the mathematical and coding reasoning performance of frontier open models despite its compact size.

03

Demonstrates high intelligence density: a 30B MoE model with only 3B activated parameters.

Abstract

We introduce Nemotron-Cascade 2, an open 30B MoE model with 3B activated parameters that delivers best-in-class reasoning and strong agentic capabilities. Despite its compact size, its mathematical and coding reasoning performance approaches that of frontier open models. It is the second open-weight LLM, after DeepSeekV3.2-Speciale-671B-A37B, to achieve Gold Medal-level performance in the 2025 International Mathematical Olympiad (IMO), the International Olympiad in Informatics (IOI), and the ICPC World Finals, demonstrating remarkably high intelligence density with 20x fewer parameters. In contrast to Nemotron-Cascade 1, the key technical advancements are as follows. After SFT on a meticulously curated dataset, we substantially expand Cascade RL to cover a much broader spectrum of reasoning and agentic domains. Furthermore, we introduce multi-domain on-policy distillation from the…

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)