One Tool Is Enough: Reinforcement Learning for Repository-Level LLM Agents

Zhaoxi Zhang; Yitong Duan; Yanzhi Zhang; Yiming Xu; Zhixiang Wang; Kun Liang; Yang Li; Jiahui Liang; Deguo Xia; Jizhou Huang; Jiyan He; Yunfang Wu

arXiv:2512.20957 · cs.SE · January 27, 2026

TL;DR

This paper introduces RepoNavigator, a reinforcement learning-trained LLM agent with a single execution-aware tool for efficient repository-level issue localization, outperforming larger models and existing methods.

Contribution

The paper presents a unified LLM agent design, trained with RL, that replaces multiple auxiliary tools with a single jump-to-definition tool, simplifying tool use and improving issue localization in large software repositories.

Findings

01. The 7B model outperforms 14B baselines.

02. The 14B model surpasses 32B competitors.

03. The 32B model exceeds GPT-5 on most metrics.

Abstract

Locating files and functions requiring modification in large software repositories is challenging due to their scale and structural complexity. Existing LLM-based methods typically treat this as a repository-level retrieval task and rely on multiple auxiliary tools, which often overlook code execution logic and complicate model control. We propose RepoNavigator, an LLM agent equipped with a single execution-aware tool: jumping to the definition of an invoked symbol. This unified design reflects the actual flow of code execution while simplifying tool manipulation. RepoNavigator is trained end-to-end via Reinforcement Learning (RL) directly from a base pretrained model, without relying on closed-source distillation. Experiments demonstrate that RL-trained RepoNavigator achieves state-of-the-art performance, with the 7B model outperforming 14B baselines, the 14B model surpassing 32B…
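
To make the single-tool idea concrete, here is a minimal sketch of what a "jump to the definition of an invoked symbol" tool could look like for Python code. It is not the paper's implementation, which the abstract does not describe; the names `jump_to_definition`, `invoked_symbols`, and the in-memory `REPO` dictionary are hypothetical, and name-based matching with the standard `ast` module is a simplifying assumption (a real tool would need to resolve imports and scopes).

```python
from __future__ import annotations

import ast

# Hypothetical two-file repository held in memory (illustrative only).
REPO = {
    "app/main.py": "from app.util import parse\n\ndef handle(raw):\n    return parse(raw).upper()\n",
    "app/util.py": "def parse(raw):\n    return raw.strip()\n",
}

DEF_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def jump_to_definition(sources: dict[str, str], symbol: str) -> list[tuple[str, int, str]]:
    """Return (path, line, source) for every def/class named `symbol`.

    Assumption: symbols are matched by bare name only, so same-named
    definitions in different files are all returned.
    """
    hits = []
    for path, text in sorted(sources.items()):
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, DEF_NODES) and node.name == symbol:
                hits.append((path, node.lineno, ast.get_source_segment(text, node)))
    return hits


def invoked_symbols(code: str) -> list[str]:
    """Return the sorted, de-duplicated names called in `code`.

    These are the candidate symbols an agent could jump to next.
    """
    names = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                names.add(node.func.id)      # plain call: parse(raw)
            elif isinstance(node.func, ast.Attribute):
                names.add(node.func.attr)    # method call: x.upper()
    return sorted(names)
```

Under these assumptions, an agent starting in `app/main.py` would call `invoked_symbols` on the code it is reading, pick `parse`, and call `jump_to_definition(REPO, "parse")` to land in `app/util.py`, following the call chain rather than searching the repository by keyword.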

Taxonomy

Topics: Software Engineering Research · Software System Performance and Reliability · Web Data Mining and Analysis