Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu, Tengyu Xu, Di Jin, Karthik Abinav Sankararaman, Yun He, Wenxuan Zhou, Zhouhao Zeng, Eryk Helenowski, Chen Zhu, Sinong Wang, Hao Ma, Han Fang

TL;DR
This paper introduces IBPO, a method enabling language models to allocate inference resources adaptively based on question difficulty, significantly improving math problem-solving performance under constrained inference budgets.
Contribution
The paper proposes IBPO, a novel inference-aware optimization technique that allows models to dynamically allocate inference budgets, enhancing reasoning efficiency and accuracy.
Findings
Achieved up to 5.74% absolute improvement on MATH500 with increased inference budgets.
Models learned to allocate more inference resources to harder questions.
Performance gains are about twice those of self-consistency under the same budgets.
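For context on the self-consistency baseline mentioned above: self-consistency samples several answers to the same question and returns the one that appears most often. The sketch below is a minimal illustration of that voting step only. The function name `self_consistency` and the string-based answer format are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def self_consistency(answers):
    """Majority vote over sampled final answers (illustrative sketch).

    `answers` is a list of final-answer strings, one per sampled
    reasoning chain. Returns the winning answer and its vote share.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Normalize whitespace so "42" and "42 " count as the same answer.
    normalized = [a.strip() for a in answers]
    # most_common(1) returns the answer with the highest count; ties are
    # broken in favor of the answer that was seen first.
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)


if __name__ == "__main__":
    print(self_consistency(["42", "41", "42 ", "7"]))
```

The inference cost of this baseline grows linearly with the number of samples, and every question receives the same number of samples regardless of difficulty.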
Abstract
Solving mathematics problems has been an intriguing capability of large language models, and many efforts have been made to improve reasoning by extending reasoning length, such as through self-correction and extensive long chain-of-thoughts. While promising in problem-solving, advanced long reasoning chain models exhibit an undesired single-modal behavior, where trivial questions require unnecessarily tedious long chains of thought. In this work, we propose a way to allow models to be aware of inference budgets by formulating it as utility maximization with respect to an inference budget constraint, hence naming our algorithm Inference Budget-Constrained Policy Optimization (IBPO). In a nutshell, models fine-tuned through IBPO learn to "understand" the difficulty of queries and allocate inference budgets to harder ones. With different inference budgets, our best models are able to…
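The idea of spending a limited inference budget on harder queries can be illustrated with a toy allocation routine. This is not IBPO itself, which is a fine-tuning method; it is only a hand-written greedy sketch of budget-constrained allocation. The function `allocate_budget`, the difficulty scores, and the per-query costs are all hypothetical and chosen for illustration.

```python
def allocate_budget(difficulties, short_cost, long_cost, budget):
    """Greedy budget-constrained allocation (illustrative, not IBPO).

    Every query gets a short response by default. The hardest queries
    are upgraded to long reasoning, one at a time, as long as the total
    cost stays within `budget`. Assumes long_cost > short_cost.
    """
    n = len(difficulties)
    total = n * short_cost
    if total > budget:
        raise ValueError("budget cannot cover short responses for all queries")
    use_long = [False] * n
    extra = long_cost - short_cost  # added cost of upgrading one query
    # Visit queries from hardest to easiest.
    for i in sorted(range(n), key=lambda i: difficulties[i], reverse=True):
        if total + extra > budget:
            break
        use_long[i] = True
        total += extra
    return use_long, total


if __name__ == "__main__":
    # Hypothetical difficulty scores in [0, 1] for four queries.
    plan, cost = allocate_budget([0.1, 0.9, 0.5, 0.7],
                                 short_cost=1, long_cost=4, budget=10)
    print(plan, cost)
```

In this toy setting the difficulty scores are given as input, whereas the abstract describes a model that learns to judge difficulty on its own through fine-tuning.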
Taxonomy
Topics: AI-based Problem Solving and Planning · Intelligent Tutoring Systems and Adaptive Learning · Data Stream Mining Techniques
Methods: Attentive Walk-Aggregating Graph Neural Network
