Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models
Yongjiang Liu, Haoxi Li, Xiaosong Ma, Jie Zhang, Song Guo

TL;DR
This paper introduces TH2T, a two-stage fine-tuning method that enhances large reasoning models' ability to recognize task difficulty and reduce redundant reasoning, significantly decreasing inference costs while maintaining performance.
Contribution
The paper proposes a novel two-stage fine-tuning strategy, TH2T, that improves LRMs' difficulty and redundancy cognition to mitigate overthinking and reduce inference costs.
Findings
Reduces inference costs by over 70% on easy tasks
Maintains performance stability across models
Enhances difficulty-aware reasoning and reduces redundancy
Abstract
Recent Large Reasoning Models (LRMs) excel at complex reasoning tasks but often suffer from overthinking, generating overly long and redundant reasoning trajectories. To explore the cause, our empirical analysis reveals that LRMs have limited ability to recognize task properties (i.e., difficulty levels) before solving a problem, as humans do, leading to a one-size-fits-all reasoning process. Inspired by this, a pressing and natural question emerges: Can we explicitly bootstrap such ability to alleviate overthinking in LRMs? In this paper, we propose Think-How-to-Think (TH2T), a novel two-stage fine-tuning strategy that progressively inspires difficulty cognition and redundancy cognition in LRMs. Specifically, we first inject difficulty hypnosis into output prefixes to guide the model toward adaptive reasoning depth, trained on a hybrid dataset mixing short and long reasoning…
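The first stage described in the abstract (difficulty prefixes plus a hybrid dataset of short and long reasoning) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the prefix strings, the `Example` fields, the `build_hybrid_dataset` helper, and the idea that a difficulty label is already available for each question are all hypothetical and do not come from the source.

```python
import random
from dataclasses import dataclass

# Hypothetical prefix wording; the source does not give the actual strings.
EASY_PREFIX = "This is a simple question, let's think quickly."
HARD_PREFIX = "This is a hard question, let's think carefully."


@dataclass
class Example:
    question: str
    short_response: str  # concise reasoning trace
    long_response: str   # full-length reasoning trace
    is_easy: bool        # difficulty label, assumed to be given


def build_hybrid_dataset(examples, seed=0):
    """Pair easy questions with short traces and hard ones with long traces,
    prepending a difficulty prefix to each fine-tuning target."""
    rows = []
    for ex in examples:
        if ex.is_easy:
            target = f"{EASY_PREFIX}\n{ex.short_response}"
        else:
            target = f"{HARD_PREFIX}\n{ex.long_response}"
        rows.append({"prompt": ex.question, "completion": target})
    # Shuffle so easy and hard samples are mixed within the training set.
    random.Random(seed).shuffle(rows)
    return rows
```

Each resulting row is a prompt/completion pair that a standard supervised fine-tuning loop could consume. How difficulty is actually labeled, and how the second (redundancy) stage works, is not covered by the truncated abstract and is left out here.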
