ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Ling Yang; Zhaochen Yu; Bin Cui; Mengdi Wang

arXiv:2502.06772 · cs.CL · March 12, 2025 · 2 cites

TL;DR

ReasonFlux combines a hierarchical thought-template library with reinforcement learning to improve large language models' mathematical reasoning, reaching 91.2% accuracy on MATH and solving 56.7% of AIME problems while training ReasonFlux-32B on only 8 GPUs.

Contribution

The paper proposes a novel hierarchical reasoning framework with a structured template library, reinforcement learning for template planning, and an inference scaling system, improving LLM math reasoning.

Findings

01 Achieves 91.2% accuracy on the MATH benchmark.

02 Solves 56.7% of AIME problems, surpassing OpenAI o1-preview and DeepSeek V3.

03 Trains ReasonFlux-32B using only 8 GPUs.

Abstract

We show that hierarchical LLM reasoning via scaling thought templates can effectively optimize the reasoning search space and outperform the mathematical reasoning capabilities of powerful LLMs like OpenAI o1-preview and DeepSeek V3. We train our ReasonFlux-32B model with only 8 GPUs and introduce three innovations: (i) a structured and generic thought template library, containing around 500 high-level thought templates capable of generalizing to similar or relevant reasoning problems; (ii) hierarchical reinforcement learning on a sequence of thought templates instead of long CoTs, optimizing a base LLM to plan an optimal template trajectory for gradually handling complex problems; (iii) a new inference scaling system that enables hierarchical LLM reasoning by adaptively scaling thought templates at inference time. With a template trajectory containing more…
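To make the idea of a thought template library and a template trajectory concrete, here is a toy sketch. It is not the authors' implementation: the `ThoughtTemplate` schema, the three example templates, and the keyword-overlap ranking in `retrieve` are all assumptions made for illustration. In particular, the keyword overlap merely stands in for the planner that the paper trains with hierarchical reinforcement learning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThoughtTemplate:
    """A high-level, reusable solution outline (hypothetical schema)."""
    name: str
    tags: frozenset
    steps: tuple


# Toy library with invented entries; the abstract describes around 500 templates.
LIBRARY = [
    ThoughtTemplate(
        name="substitution",
        tags=frozenset({"algebra", "radical", "equation"}),
        steps=("introduce a new variable", "rewrite the equation",
               "solve and back-substitute"),
    ),
    ThoughtTemplate(
        name="case_analysis",
        tags=frozenset({"absolute", "inequality", "piecewise"}),
        steps=("split the domain into cases", "solve each case",
               "merge the valid solutions"),
    ),
    ThoughtTemplate(
        name="modular_arithmetic",
        tags=frozenset({"remainder", "divisible", "integer"}),
        steps=("reduce modulo m", "enumerate residues",
               "lift back to integers"),
    ),
]


def retrieve(keywords, library, k=2):
    """Rank templates by tag overlap with the keywords; keep the top k matches.

    Assumption: tag overlap is a crude placeholder for a learned planner.
    """
    wanted = set(keywords)
    scored = [(len(t.tags & wanted), t) for t in library]
    scored = [(score, t) for score, t in scored if score > 0]
    # list.sort is stable, so ties keep their library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]


def plan_trajectory(keywords, library, k=2):
    """Return a template trajectory: an ordered list of template names."""
    return [t.name for t in retrieve(keywords, library, k)]


if __name__ == "__main__":
    print(plan_trajectory({"radical", "equation", "inequality"}, LIBRARY))
```

In this sketch a problem is planned at the level of a few template names rather than a long chain of thought, which is the contrast the abstract draws; raising `k` lengthens the trajectory, loosely mirroring the idea of scaling templates at inference time.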

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques

Methods: Balanced Selection