AdapThink: Adaptive Thinking Preferences for Reasoning Language Model

Xu Wan; Wei Wang; Wenyue Xu; Wotao Yin; Jie Song; Mingyang Sun

arXiv:2506.18237·cs.LG·June 24, 2025

TL;DR

AdapThink introduces an adaptive post-training framework for reasoning language models that dynamically balances reasoning depth and efficiency, improving performance on mathematical reasoning tasks.

Contribution

It proposes a novel adaptive mechanism combining confidence-based rewards and diversity-aware sampling to enhance reasoning efficiency and effectiveness.

Findings

01 Improves reasoning efficiency by reducing unnecessary computation.

02 Maintains or enhances accuracy on mathematical reasoning datasets.

03 Enables dynamic adjustment of reasoning depth based on question complexity.

Abstract

Reinforcement Learning (RL)-based post-training has significantly advanced the complex reasoning capabilities of language models, fostering sophisticated self-reflection processes. However, this "slow thinking" paradigm presents a critical challenge to reasoning efficiency: models may expend excessive computation on simple questions and shift reasoning prematurely for complex ones. Previous mechanisms typically rely on static length budgets or predefined rules, lacking the adaptability for varying question complexities and models' evolving capabilities. To this end, we propose AdapThink, an adaptive post-training framework designed to induce more efficient thinking while maintaining the performance of reasoning language models. Specifically, AdapThink incorporates two key mechanisms: 1) A group-relative reward function that leverages model confidence and response's characteristic to…
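
For readers unfamiliar with the term, "group-relative" in RL post-training usually means scoring each sampled response against the other responses drawn for the same question. The sketch below shows only that generic standardization step, as used in GRPO-style methods. It is not AdapThink's reward function: the confidence and response-characteristic terms are cut off in the abstract above and are not reproduced here. The function name `group_relative_advantages` and the `eps` parameter are hypothetical, chosen for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the group sampled for one question.

    Assumption: `rewards` holds one scalar reward per sampled response.
    This is a generic illustration, not the reward defined in the paper.
    """
    mu = mean(rewards)        # group mean reward
    sigma = pstdev(rewards)   # group (population) standard deviation
    # eps avoids division by zero when every response gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled responses to one question: two correct (1.0), two wrong (0.0)
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Responses that beat the group average receive a positive value and the rest a negative one, so the signal depends on how a response compares with its siblings rather than on the absolute reward scale.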
