RPM-MCTS: Knowledge-Retrieval as Process Reward Model with Monte Carlo Tree Search for Code Generation
Yuanyuan Lin, Xiangyu Ouyang, Teng Zhang, Kaixin Sui

TL;DR
RPM-MCTS is a knowledge-retrieval-based Monte Carlo Tree Search approach to code generation. It evaluates intermediate steps to reduce errors and lower computational cost, and it outperforms existing methods on multiple benchmarks.
Contribution
The paper presents RPM-MCTS, a novel method that leverages knowledge retrieval and sandbox feedback within MCTS to enhance code generation without complex reward model training.
Findings
Outperforms state-of-the-art methods on four benchmarks
Reduces token consumption by approximately 15%
Enhances code capabilities through fine-tuning with RPM-MCTS data
Abstract
Tree search-based methods have made significant progress in enhancing the code generation capabilities of large language models. However, because it is difficult to evaluate intermediate algorithmic steps effectively and to locate and promptly correct erroneous steps, these methods often generate incorrect code and incur increased computational costs. To tackle these problems, we propose RPM-MCTS, an effective method that uses Knowledge-Retrieval as Process Reward Model based on Monte Carlo Tree Search to evaluate intermediate algorithmic steps. By using knowledge base retrieval, RPM-MCTS avoids the complex training of process reward models. During the expansion phase, similarity filtering is employed to remove redundant nodes, ensuring diversity in reasoning paths. Furthermore, our method utilizes sandbox execution feedback to locate erroneous algorithmic steps…
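To make the expansion-phase similarity filtering concrete, here is a minimal sketch in Python. It is illustrative only: the abstract does not say which similarity measure or threshold RPM-MCTS uses, so the token-level Jaccard similarity, the 0.8 threshold, and the names `jaccard` and `filter_similar` are all assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed measure, not the paper's)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def filter_similar(candidates: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a candidate step only if it is not too similar to any kept step."""
    kept: list[str] = []
    for cand in candidates:
        # Drop the candidate if it nearly duplicates an already-kept step.
        if all(jaccard(cand, k) < threshold for k in kept):
            kept.append(cand)
    return kept


# Hypothetical candidate steps proposed when expanding one tree node.
steps = [
    "sort the array in ascending order",
    "Sort the array in ascending order",
    "use a hash map to count occurrences",
]
unique_steps = filter_similar(steps)
print(unique_steps)
```

In this sketch the second candidate differs from the first only in capitalization, so it is dropped and two distinct child nodes remain. A real system would more likely compare steps with embeddings than with token overlap.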
Taxonomy
Topics: Software Engineering Research · Natural Language Processing Techniques · Topic Modeling
