AdaTIR: Adaptive Tool-Integrated Reasoning via Difficulty-Aware Policy Optimization
Zhaiyu Fang, Ruipeng Sun

TL;DR
AdaTIR introduces a dynamic, difficulty-aware reasoning framework for LLMs that reduces unnecessary tool use and internalizes simple reasoning, significantly improving efficiency and accuracy.
Contribution
The paper proposes AdaTIR, a framework that adapts tool invocation to task difficulty, and introduces Clipped Advantage Shaping to address the reward sign reversal problem.
Findings
Reduces tool calls by up to 97.6% on simple tasks
Maintains or improves accuracy with fewer tool invocations
Outperforms baselines by 4.8% on AIME 2024 without tool access
Abstract
Tool-Integrated Reasoning (TIR) has significantly enhanced the capabilities of Large Language Models (LLMs), yet current agents tend to exhibit cognitive offloading, redundantly invoking external tools even for simple tasks. In this paper, we suggest that true agentic intelligence requires not just the ability to invoke tools, but the adaptive wisdom to discern when to use them. We propose AdaTIR, a framework that shifts the paradigm from static tool invocation to difficulty-aware reasoning internalization. By introducing a difficulty-aware efficiency reward, AdaTIR dynamically adjusts tool budgets based on task complexity, internalizing reasoning for simple tasks while selectively invoking tools for complex ones. Furthermore, we identify a sign reversal problem where tool penalties outweigh correctness rewards, mistakenly penalizing correct rollouts with negative advantages. To resolve this, we…
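The sign reversal problem described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's formulation (the abstract is truncated before the method is stated); it assumes a hypothetical reward of the form "correctness minus a per-call tool penalty that shrinks with difficulty", a mean-centred group baseline, and a simple floor on the advantages of correct rollouts. The names shaped_reward, group_advantages, and clip_correct_advantages, the penalty value, and the difficulty scale in [0, 1] are all illustrative assumptions.

```python
from statistics import mean


def shaped_reward(correct: bool, tool_calls: int, difficulty: float,
                  penalty: float = 0.25) -> float:
    """Hypothetical reward: 1 for a correct answer, minus a tool penalty.

    Assumption: difficulty lies in [0, 1], and the per-call penalty is
    scaled by (1 - difficulty), so easy tasks are penalized most.
    """
    return (1.0 if correct else 0.0) - penalty * (1.0 - difficulty) * tool_calls


def group_advantages(rewards: list[float]) -> list[float]:
    """Mean-centred advantages within one group of rollouts.

    Assumption: a group-relative baseline; variance normalization is omitted.
    """
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def clip_correct_advantages(advantages: list[float], correct: list[bool],
                            floor: float = 0.0) -> list[float]:
    """Illustrative fix (assumption): floor the advantage of correct rollouts."""
    return [max(a, floor) if ok else a for a, ok in zip(advantages, correct)]


if __name__ == "__main__":
    # Four rollouts on an easy task (difficulty 0.0): (correct?, tool calls).
    rollouts = [(True, 0), (True, 5), (False, 0), (True, 0)]
    correct = [c for c, _ in rollouts]
    rewards = [shaped_reward(c, n, difficulty=0.0) for c, n in rollouts]
    raw = group_advantages(rewards)
    print("rewards:   ", rewards)
    print("advantages:", raw)
    print("clipped:   ", clip_correct_advantages(raw, correct))
```

In this toy group, the correct rollout that made five tool calls ends up with a lower advantage than the incorrect rollout, which is the kind of reversal the abstract describes. Flooring the advantage at zero removes the negative signal for that rollout while still giving it less credit than the tool-free correct rollouts; whether the paper's Clipped Advantage Shaping works this way is not stated in the visible text.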
Taxonomy
Topics: Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education · Reinforcement Learning in Robotics
