Empowering RepoQA-Agent based on Reinforcement Learning Driven by Monte-carlo Tree Search

Guochang Li; Yuchen Liu; Zhen Qin; Yunkun Wang; Jianping Zhong; Chen Zhi; Binhua Li; Fei Huang; Yongbin Li; Shuiguang Deng

arXiv:2510.26287 · cs.SE · October 31, 2025

TL;DR

This paper introduces RepoSearch-R1, a reinforcement learning framework driven by Monte Carlo Tree Search (MCTS) that enables efficient, data-compliant repository question answering without model distillation or external supervision.

Contribution

The paper presents an agentic reinforcement learning approach that uses MCTS to improve repository QA performance and training efficiency without relying on costly distillation from larger LLMs or on external data.

Findings

01  16.0% improvement over no-retrieval methods

02  19.5% improvement over iterative retrieval methods

03  33% increase in training efficiency

Abstract

Repository-level software engineering tasks require large language models (LLMs) to efficiently navigate and extract information from complex codebases through multi-turn tool interactions. Existing approaches face significant limitations: training-free, in-context learning methods struggle to guide agents effectively in tool utilization and decision-making based on environmental feedback, while training-based approaches typically rely on costly distillation from larger LLMs, introducing data compliance concerns in enterprise environments. To address these challenges, we introduce RepoSearch-R1, a novel agentic reinforcement learning framework driven by Monte-carlo Tree Search (MCTS). This approach allows agents to generate diverse, high-quality reasoning trajectories via self-training without requiring model distillation or external supervision. Based on RepoSearch-R1, we construct a…
