NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

Weizhe Yuan; Jane Yu; Song Jiang; Karthik Padthe; Yang Li; Ilia Kulikov; Kyunghyun Cho; Dong Wang; Yuandong Tian; Jason E Weston; Xian Li

arXiv:2502.13124 · cs.CL · November 10, 2025

TL;DR

NaturalReasoning is a dataset of 2.8 million diverse, challenging reasoning questions spanning multiple domains. The questions are shown to be useful for knowledge distillation and for unsupervised self-training.

Contribution

The paper presents a scalable method for generating diverse, challenging reasoning questions with reference answers, and releases the resulting dataset to advance reasoning research.

Findings

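01. Knowledge distillation on NaturalReasoning questions effectively elicits and transfers reasoning capabilities from a strong teacher model.

02. Unsupervised self-training on the questions works with external reward models or with self-rewarding.

03. The questions are diverse and challenging, spanning STEM fields, Economics, Social Sciences, and more.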

Abstract

Scaling reasoning capabilities beyond traditional domains such as math and coding is hindered by the lack of diverse and high-quality questions. To overcome this limitation, we introduce a scalable approach for generating diverse and challenging reasoning questions, accompanied by reference answers. We present NaturalReasoning, a comprehensive dataset comprising 2.8 million questions that span multiple domains, including STEM fields (e.g., Physics, Computer Science), Economics, Social Sciences, and more. We demonstrate the utility of the questions in NaturalReasoning through knowledge distillation experiments which show that NaturalReasoning can effectively elicit and transfer reasoning capabilities from a strong teacher model. Furthermore, we demonstrate that NaturalReasoning is also effective for unsupervised self-training using external reward models or self-rewarding. To foster…
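As a rough illustration of what self-training with an external reward model can look like, the sketch below samples several candidate responses per question and keeps the highest-scoring one as a training target (best-of-n selection). This is a generic pattern, not the procedure from the paper, which this page does not describe. The names `select_best_responses`, `sample_fn`, `reward_fn`, `toy_sample`, and `toy_reward` are hypothetical stand-ins for a real generator and reward model.

```python
from typing import Callable, Dict, List


def select_best_responses(
    questions: List[str],
    sample_fn: Callable[[str, int], List[str]],
    reward_fn: Callable[[str, str], float],
    num_samples: int = 4,
) -> List[Dict[str, str]]:
    """Keep the highest-reward sampled response for each question.

    sample_fn and reward_fn are assumed interfaces: the first returns
    num_samples candidate responses, the second scores one response.
    """
    selected = []
    for question in questions:
        candidates = sample_fn(question, num_samples)
        if not candidates:
            continue  # nothing to score for this question
        best = max(candidates, key=lambda r: reward_fn(question, r))
        selected.append({"question": question, "response": best})
    return selected


# Toy stand-ins so the sketch runs without any model.
def toy_sample(question: str, n: int) -> List[str]:
    return [f"answer {i}" for i in range(n)]


def toy_reward(question: str, response: str) -> float:
    # Pretend later samples are better; a real reward model goes here.
    return float(response.split()[-1])


pairs = select_best_responses(
    ["Why is the sky blue?"], toy_sample, toy_reward, num_samples=3
)
```

The selected question and response pairs would then serve as fine-tuning data for the same model. In a self-rewarding variant, `reward_fn` would be backed by the model being trained rather than by a separate reward model.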

Videos

NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions (SlidesLive)

Taxonomy

Topics: Natural Language Processing Techniques · Multi-Agent Systems and Negotiation

Methods: Knowledge Distillation