Reasoning and Tool-use Compete in Agentic RL: From Quantifying Interference to Disentangled Tuning
Yu Li, Mingyang Yi, Xiuyu Li, Ju Fan, Fuxin Jiang, Binbin Chen, Peng Li, Jie Song, Tieying Zhang

TL;DR
This paper investigates the interference between reasoning and tool-use in agentic reinforcement learning, revealing that joint training can hinder performance, and proposes a disentangled tuning method to improve outcomes.
Contribution
It introduces LEAS for quantifying interference and DART for decoupling reasoning and tool-use training, improving agent performance.
Findings
LEAS quantifies interference between reasoning and tool-use.
DART outperforms baseline methods by 6.35% on average.
DART achieves performance comparable to multi-agent systems.
Abstract
Agentic Reinforcement Learning (ARL) focuses on training large language models (LLMs) to interleave reasoning with external tool execution to solve complex tasks. Most existing ARL methods train a single shared set of model parameters to support both reasoning and tool-use behaviors, implicitly assuming that joint training leads to improved overall agent performance. Despite its widespread adoption, this assumption has rarely been examined empirically. In this paper, we systematically investigate this assumption by introducing a Linear Effect Attribution System (LEAS), which provides quantitative evidence of interference between reasoning and tool-use behaviors. Through an in-depth analysis, we show that these two capabilities often induce misaligned gradient directions, leading to training interference that undermines the effectiveness of joint optimization and challenges the prevailing ARL…
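Illustration
The abstract's notion of "misaligned gradient directions" can be made concrete with a toy example. The sketch below is not LEAS or DART, whose details are not given in the text above. It only assumes two hypothetical quadratic losses (one standing in for reasoning, one for tool use) over the same shared parameters, and measures the cosine similarity of their gradients. All names and values (`quadratic_grad`, `reasoning_target`, `tool_use_target`) are invented for illustration.

```python
import math

def cosine_similarity(g1, g2):
    """Cosine of the angle between two gradient vectors (flat lists of floats)."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # convention here: a zero gradient counts as "no alignment"
    return dot / (norm1 * norm2)

def quadratic_grad(theta, target):
    """Gradient of the toy loss 0.5 * ||theta - target||^2 with respect to theta."""
    return [t - c for t, c in zip(theta, target)]

# Hypothetical shared parameters and per-capability optima (illustrative only,
# not taken from the paper).
theta = [0.0, 0.0]
reasoning_target = [1.0, 0.0]
tool_use_target = [-1.0, 0.5]

g_reasoning = quadratic_grad(theta, reasoning_target)
g_tool_use = quadratic_grad(theta, tool_use_target)
similarity = cosine_similarity(g_reasoning, g_tool_use)

# A negative value means a step that lowers one loss tends to raise the other.
print(f"cosine similarity: {similarity:.3f}")
```

In this toy setup the similarity is negative, so a single update on the shared parameters cannot follow both gradients at once. That is the kind of conflict the abstract attributes to joint training, shown here under assumed losses rather than measured on a real model.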
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Topic Modeling
