AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
Yuliang Liu, Junjie Lu, Zhaoling Chen, Chaofeng Qu, Jason Klein Liu, Chonghan Liu, Zefan Cai, Yunhui Xia, Li Zhao, Jiang Bian, Chuheng Zhang, Wei Shen, Zhouhan Lin

TL;DR
AdaptiveStep introduces a confidence-based approach to dividing reasoning steps in process reward models, improving performance and efficiency in mathematical reasoning and code generation tasks without manual annotations.
Contribution
It presents a novel confidence-driven method for dividing reasoning steps, enhancing reward model training and outperforming existing strategies in key tasks.
Findings
Achieves state-of-the-art performance in mathematical reasoning and code generation
Reduces PRM construction costs by over 30%
Improves transferability and generalization of reward models
Abstract
Current approaches for training Process Reward Models (PRMs) often break responses into multiple reasoning steps using rule-based techniques, such as splitting at predefined placeholder tokens or fixing each reasoning step at a set length. These approaches overlook the fact that specific words do not typically mark true decision points in a text. To address this, we propose AdaptiveStep, a method that divides reasoning steps based on the model's confidence in predicting the next word. This division method provides more decision-making information at each step, enhancing downstream tasks such as reward model learning. Moreover, our method does not require manual annotation. We demonstrate its effectiveness through experiments with AdaptiveStep-trained PRMs in mathematical reasoning and code generation tasks. Experimental results indicate that the outcome PRM achieves…
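The confidence-based division described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-token confidences (for example, the probability the model assigned to each sampled token) are already available, that a boundary is placed immediately before any token whose confidence falls below a fixed threshold, and that the threshold value is chosen by the user. The function name `split_by_confidence`, the threshold of 0.5, and the toy tokens and confidences are all hypothetical.

```python
from typing import List, Sequence


def split_by_confidence(
    tokens: Sequence[str],
    confidences: Sequence[float],
    threshold: float,
) -> List[List[str]]:
    """Group tokens into steps, starting a new step at each low-confidence token.

    Assumption (not from the source): the boundary is placed *before* the
    low-confidence token, so that token opens the next step.
    """
    if len(tokens) != len(confidences):
        raise ValueError("tokens and confidences must have the same length")

    steps: List[List[str]] = []
    current: List[str] = []
    for token, conf in zip(tokens, confidences):
        # A token predicted with low confidence is treated as a decision
        # point: close the running step (if it is non-empty) and start anew.
        if conf < threshold and current:
            steps.append(current)
            current = []
        current.append(token)
    if current:
        steps.append(current)
    return steps


# Toy input: made-up tokens and confidences, for illustration only.
tokens = ["Let", "x", "=", "2", "so", "x", "+", "3", "=", "5"]
confidences = [0.90, 0.95, 0.99, 0.40, 0.97, 0.90, 0.98, 0.30, 0.99, 0.96]
steps = split_by_confidence(tokens, confidences, threshold=0.5)
for step in steps:
    print(" ".join(step))
```

In this toy input the two low-confidence tokens ("2" and "3") each open a new step, so boundaries follow the model's uncertainty rather than a fixed length or a placeholder token.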
Taxonomy
Topics: Semantic Web and Ontologies · Intelligent Tutoring Systems and Adaptive Learning · Business Process Modeling and Analysis
