RATIONALYST: Mining Implicit Rationales for Process Supervision of Reasoning

Dongwei Jiang; Guoxuan Wang; Yining Lu; Andrew Wang; Jingyu Zhang; Chuyu Liu; Benjamin Van Durme; Daniel Khashabi

arXiv:2410.01044·cs.AI·June 17, 2025

TL;DR

RATIONALYST is a model trained on a large collection of implicit rationales extracted from unlabeled data. It improves reasoning accuracy across diverse tasks and outperforms larger models and verifiers.

Contribution

Introduces a web-scale pre-training approach for reasoning that extracts rationales from unlabeled data to improve the reasoning performance of LLMs.

Findings

01 Improves reasoning accuracy by an average of 3.9% on 7 representative reasoning benchmarks.

02 Generalizes across mathematical, commonsense, scientific, and logical reasoning.

03 Outperforms larger models such as GPT-4 on reasoning tasks.

Abstract

The reasoning steps generated by LLMs might be incomplete, as they mimic logical leaps common in everyday communication found in their pre-training data: underlying rationales are frequently left implicit (unstated). To address this challenge, we introduce RATIONALYST, a model for process-supervision of reasoning based on pre-training on a vast collection of rationale annotations extracted from unlabeled data. We extract 79k rationales from a web-scale unlabeled dataset (the Pile) and a combination of reasoning datasets with minimal human intervention. This web-scale pre-training for reasoning allows RATIONALYST to consistently generalize across diverse reasoning tasks, including mathematical, commonsense, scientific, and logical reasoning. Fine-tuned from LLaMa-3-8B, RATIONALYST improves the accuracy of reasoning by an average of 3.9% on 7 representative reasoning benchmarks. It also…

Code & Models

Repositories

jhu-clsp/rationalyst
PyTorch · Official

Models

🤗 Dongwei/Rationalyst_reasoning_datasets
model · 36 dl · ♡ 4

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Layer Normalization · Dense Connections · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding