OpenCodeReasoning-II: A Simple Test Time Scaling Approach via Self-Critique
Wasi Uddin Ahmad, Somshubra Majumdar, Aleksander Ficek, Sean Narenthiran, Mehrzad Samadi, Jocelyn Huang, Siddhartha Jain, Vahid Noroozi, Boris Ginsburg

TL;DR
This paper introduces OpenCodeReasoning-II, a large dataset for code reasoning, and a two-stage fine-tuning approach for LLMs that improves code generation and critique, enhancing performance on coding benchmarks.
Contribution
The paper presents a new large-scale dataset and a novel two-stage fine-tuning method for LLMs to improve code reasoning and critique capabilities.
Findings
Achieved state-of-the-art performance in code generation.
Significant improvements in competitive coding tasks.
Extended benchmark support to the C++ language.
Abstract
Recent advancements in reasoning-based Large Language Models (LLMs), particularly their potential through test-time scaling, have created significant opportunities for distillation in code generation and critique. However, progress in both areas fundamentally depends on large-scale, high-quality datasets. In this work, we introduce OpenCodeReasoning-II, a dataset consisting of 2.5M question-solution-critique triples (approx. 35K unique programming questions), making it nearly twice the size of the previous largest publicly available code reasoning dataset. We employ a two-stage supervised fine-tuning strategy. The first stage focuses on fine-tuning for code generation, while the second stage involves the joint training of models for both code generation and critique. Our resulting finetuned Qwen2.5-Instruct models achieve performance in code generation that either exceeds or…
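The abstract does not spell out how the critique capability is used at inference time, so the following is only a minimal sketch of one plausible form of test-time scaling via self-critique: sample several candidate solutions, ask a critique step to judge each one, and return the first candidate judged correct. The function name `select_with_self_critique`, the `generate` and `critique` callables, and the fallback to the first candidate are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Callable


def select_with_self_critique(
    question: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], bool],
    num_samples: int = 4,
) -> str:
    """Sample candidates, filter them with a critique step, return one.

    `generate` and `critique` are hypothetical stand-ins for calls to a
    fine-tuned model: `generate(question)` returns one candidate solution,
    and `critique(question, solution)` returns True when the solution is
    judged correct. Neither signature comes from the paper.
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")

    # Test-time scaling: spend more inference compute by drawing several samples.
    candidates = [generate(question) for _ in range(num_samples)]

    # Self-critique: keep only the candidates the critique step accepts.
    accepted = [c for c in candidates if critique(question, c)]

    # Assumed fallback: if nothing is accepted, return the first sample.
    return accepted[0] if accepted else candidates[0]


if __name__ == "__main__":
    # Toy stubs standing in for model calls, for demonstration only.
    drafts = iter(["return a - b", "return a + b", "return a * b"])
    picked = select_with_self_critique(
        "Add two integers a and b.",
        generate=lambda q: next(drafts),
        critique=lambda q, s: "+" in s,
        num_samples=3,
    )
    print(picked)  # prints "return a + b"
```

In this sketch, raising `num_samples` is the only scaling knob; a real system would presumably also need a policy for ties and for the case where the critique step rejects every candidate.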
Code & Models
- 🤗 nvidia/OpenReasoning-Nemotron-32B (model) · 112k dl · ♡ 121
- 🤗 nvidia/OpenReasoning-Nemotron-14B (model) · 4.5k dl · ♡ 43
- 🤗 nvidia/OpenReasoning-Nemotron-7B (model) · 1.1k dl · ♡ 49
- 🤗 nvidia/OpenReasoning-Nemotron-1.5B (model) · 13k dl · ♡ 54
- 🤗 gabriellarson/OpenReasoning-Nemotron-1.5B-GGUF (model) · 60 dl
- 🤗 gabriellarson/OpenReasoning-Nemotron-7B-GGUF (model) · 97 dl
- 🤗 gabriellarson/OpenReasoning-Nemotron-14B-GGUF (model) · 28 dl · ♡ 2
- 🤗 gabriellarson/OpenReasoning-Nemotron-32B-GGUF (model) · 55 dl · ♡ 1
- 🤗 unsloth/OpenReasoning-Nemotron-32B (model) · 19 dl · ♡ 2
- 🤗 unsloth/OpenReasoning-Nemotron-32B-GGUF (model) · 194 dl · ♡ 11
Taxonomy
Topics: Software Testing and Debugging Techniques · Model-Driven Software Engineering Techniques · Real-time simulation and control systems
