A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law
Qianjun Pan, Wenkai Ji, Yuyang Ding, Junsong Li, Shilian Chen, Junyi Wang, Jie Zhou, Qin Chen, Min Zhang, Yulan Wu, and Liang He

TL;DR
This survey reviews recent developments in reasoning large language models inspired by human slow thinking, emphasizing dynamic scaling, reinforcement learning, and structured problem-solving to enhance reasoning capabilities.
Contribution
The survey synthesizes over 100 studies and groups their methods into three categories: test-time scaling, reinforcement learning, and slow-thinking frameworks.
Findings
Dynamic test-time scaling improves reasoning efficiency.
Reinforcement learning refines decision-making in LLMs.
Structured slow-thinking methods enhance complex problem-solving.
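As a rough illustration of the first finding, the sketch below shows one simple form of dynamic test-time scaling: sample answers repeatedly and stop as soon as a majority is clear, so easy questions use few samples and hard ones use more. It is not taken from the survey. The function names, the thresholds, and the toy sampler that stands in for an LLM call are all assumptions made for the example.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def adaptive_majority_vote(
    sample_answer: Callable[[], str],
    min_samples: int = 3,
    max_samples: int = 16,
    agreement: float = 0.7,
) -> Tuple[str, int]:
    """Sample until the leading answer holds an `agreement` share of the votes,
    or until `max_samples` is reached. Returns (answer, samples_used)."""
    votes: Counter = Counter()
    for drawn in range(1, max_samples + 1):
        votes[sample_answer()] += 1
        answer, count = votes.most_common(1)[0]
        # Stop early once enough samples agree: this is the "dynamic" part.
        if drawn >= min_samples and count / drawn >= agreement:
            return answer, drawn
    # Budget exhausted: fall back to the plain majority answer.
    return votes.most_common(1)[0][0], max_samples


def make_toy_sampler(p_correct: float, seed: int) -> Callable[[], str]:
    """Hypothetical stand-in for a model call: returns "42" with probability
    `p_correct`, otherwise a random single digit."""
    rng = random.Random(seed)

    def sample() -> str:
        return "42" if rng.random() < p_correct else str(rng.randint(0, 9))

    return sample


if __name__ == "__main__":
    for p in (0.95, 0.5):
        answer, used = adaptive_majority_vote(make_toy_sampler(p, seed=0))
        print(f"p_correct={p}: answer={answer}, samples used={used}")
```

In a real system, `sample_answer` would wrap a model call and extract the final answer. The agreement threshold trades accuracy against compute.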
Abstract
This survey explores recent advancements in reasoning large language models (LLMs) designed to mimic "slow thinking", a reasoning process inspired by human cognition, as described in Kahneman's Thinking, Fast and Slow. These models, like OpenAI's o1, focus on scaling computational resources dynamically during complex tasks, such as math reasoning, visual reasoning, medical diagnosis, and multi-agent debates. We present the development of reasoning LLMs and list their key technologies. By synthesizing over 100 studies, this survey charts a path toward LLMs that combine human-like deep thinking with scalable efficiency for reasoning. The review breaks down methods into three categories: (1) test-time scaling dynamically adjusts computation based on task complexity via search, sampling, and dynamic verification; (2) reinforced learning refines decision-making through iterative improvement…
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Constraint Satisfaction and Optimization
