MCTS-Refined CoT: High-Quality Fine-Tuning Data for LLM-Based Repository Issue Resolution
Yibo Wang, Zhihao Peng, Ying Wang, Zhao Wei, Hai Yu, Zhiliang Zhu

TL;DR
This paper introduces MCTS-Refined Chain-of-Thought, a novel method for generating high-quality fine-tuning data for LLMs in repository issue resolution, significantly improving their reasoning and accuracy.
Contribution
It proposes an MCTS-based algorithm with rejection sampling and step validation to produce superior CoT data for LLM fine-tuning in software issue resolution.
Findings
LLMs fine-tuned with our CoT data outperform baselines in resolution rates.
Fine-tuned on this data, Qwen2.5-72B-Instruct achieves resolution rates of 28.3% on SWE-bench Lite and 35.0% on SWE-bench Verified.
Models fine-tuned with our method surpass state-of-the-art models of similar size.
Abstract
LLMs demonstrate strong performance in automated software engineering, particularly for code generation and issue resolution. While proprietary models like GPT-4o achieve high scores on SWE-bench, their API dependence, cost, and privacy concerns limit adoption. Open-source alternatives offer transparency but underperform on complex tasks, especially models with fewer than 100B parameters. Although high-quality Chain-of-Thought (CoT) data can enhance reasoning, current methods for producing it have two critical flaws: (1) weak rejection sampling reduces data quality, and (2) inadequate step validation causes error accumulation. These limitations lead to flawed reasoning chains that impair LLMs' ability to learn reliable issue resolution. The paper proposes MCTS-REFINE, an enhanced Monte Carlo Tree Search (MCTS)-based algorithm that dynamically validates and optimizes intermediate reasoning steps through a…
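To make the two ideas named in the abstract concrete, the sketch below shows a generic MCTS loop that validates each step at expansion time and applies rejection sampling to finished chains. It is not the paper's MCTS-REFINE algorithm, whose details are not given here. The toy task (integer steps that must sum to a target), the validator, the final check, and every name in the code are assumptions chosen purely for illustration.

```python
import math
import random
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Toy stand-ins (assumptions, not from the paper): a "reasoning chain" is a
# tuple of integer steps, and a chain counts as correct if it sums to TARGET
# in at most MAX_STEPS steps.
TARGET = 7
ACTIONS = (1, 2, 3)
MAX_STEPS = 3

Chain = Tuple[int, ...]


def step_is_valid(chain: Chain, step: int) -> bool:
    """Step validation: reject a step that would overshoot the target."""
    return sum(chain) + step <= TARGET


def is_terminal(chain: Chain) -> bool:
    return sum(chain) == TARGET


def final_check(chain: Chain) -> float:
    """Outcome check used for rejection sampling (1.0 accept, 0.0 reject)."""
    return 1.0 if sum(chain) == TARGET and len(chain) <= MAX_STEPS else 0.0


@dataclass(eq=False)
class Node:
    chain: Chain
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

    def untried_steps(self) -> List[int]:
        tried = {child.chain[-1] for child in self.children}
        return [a for a in ACTIONS
                if a not in tried and step_is_valid(self.chain, a)]


def ucb(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))


def collect_chains(iterations: int = 200, seed: int = 0) -> Set[Chain]:
    rng = random.Random(seed)
    root = Node(chain=())
    accepted: Set[Chain] = set()
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded, non-terminal nodes.
        while not is_terminal(node.chain) and not node.untried_steps():
            parent = node
            node = max(parent.children, key=lambda ch: ucb(ch, parent.visits))
        # Expansion: only validated steps may become children.
        if not is_terminal(node.chain):
            step = rng.choice(node.untried_steps())
            child = Node(chain=node.chain + (step,), parent=node)
            node.children.append(child)
            node = child
        # Simulation: random rollout restricted to validated steps.
        chain = node.chain
        while not is_terminal(chain):
            valid = [a for a in ACTIONS if step_is_valid(chain, a)]
            chain = chain + (rng.choice(valid),)
        reward = final_check(chain)
        # Rejection sampling: keep a chain only if it passes the final check.
        if reward == 1.0:
            accepted.add(chain)
        # Backpropagation: update statistics along the selected path.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return accepted
```

Calling `collect_chains()` returns the set of chains that survived both filters. In a real data-generation pipeline the step validator and final check would presumably be replaced by model-based or test-based judgments, which this sketch does not attempt to model.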
Taxonomy
Topics: Scientific Computing and Data Management
