ARM: Adaptive Reasoning Model
Siye Wu, Jian Xie, Yikai Zhang, Aili Chen, Kai Zhang, Yu Su, Yanghua Xiao

TL;DR
The paper introduces ARM, an adaptive reasoning model that dynamically selects reasoning formats to improve token efficiency and speed, while maintaining performance on complex tasks.
Contribution
ARM adaptively selects among four reasoning formats (Direct Answer, Short CoT, Code, and Long CoT) according to task difficulty. It is trained with Ada-GRPO, a GRPO variant that counters format collapse and reduces both token usage and training time.
Findings
Achieves 30-70% token reduction without performance loss
Doubles training speed compared to non-adaptive models
Supports multiple reasoning modes for flexibility
Abstract
While large reasoning models demonstrate strong performance on complex tasks, they lack the ability to adjust reasoning token usage based on task difficulty. This often leads to the "overthinking" problem -- excessive and unnecessary reasoning -- which, although potentially mitigated by human intervention to control the token budget, still fundamentally contradicts the goal of achieving fully autonomous AI. In this work, we propose Adaptive Reasoning Model (ARM), a reasoning model capable of adaptively selecting appropriate reasoning formats based on the task at hand. These formats include three efficient ones -- Direct Answer, Short CoT, and Code -- as well as a more elaborate format, Long CoT. To train ARM, we introduce Ada-GRPO, an adaptation of Group Relative Policy Optimization (GRPO), which addresses the format collapse issue in traditional GRPO. Ada-GRPO enables ARM to achieve…
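Illustrative sketch (not from the paper): the abstract says Ada-GRPO addresses format collapse in GRPO but, as excerpted here, does not say how. The Python below shows the standard group-relative advantage used by GRPO, followed by one hypothetical way to keep rarely sampled formats alive: scaling each reward by the inverse frequency of its format within the group. The function names, the format labels, and the reweighting rule itself are assumptions made for illustration and should not be read as the authors' formula.

```python
from collections import Counter
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantage: normalize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def reweight_by_format_rarity(rewards, formats):
    """Hypothetical reweighting (an assumption, not the paper's rule):
    scale each reward by group_size / count(format), so rare formats count more."""
    counts = Counter(formats)
    group_size = len(rewards)
    return [r * group_size / counts[f] for r, f in zip(rewards, formats)]


# One group of four sampled responses to the same question, all correct.
formats = ["long_cot", "long_cot", "long_cot", "direct"]
rewards = [1.0, 1.0, 1.0, 1.0]

# Plain group-relative advantages: identical rewards give no learning signal,
# so nothing pushes the policy toward the cheaper format.
plain = group_relative_advantages(rewards)

# With the hypothetical rarity scaling, the rare "direct" response gets a
# positive advantage and the common "long_cot" responses a negative one.
adaptive = group_relative_advantages(reweight_by_format_rarity(rewards, formats))

print(plain)
print(adaptive)
```

In this toy group every response is correct, so the plain advantages are all zero, while the reweighted version favors the less frequent format. A practical scheme would likely need to fade such a bonus over training so that accuracy, not rarity, ends up driving format choice; how ARM handles this is not stated in the excerpt above.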
Taxonomy
Topics: Fuzzy Logic and Control Systems · AI-based Problem Solving and Planning
