Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement

Xiaoqing Zhang, Yuhan Liu, Flood Sung, Xiuying Chen, Shuo Shang, and Rui Yan

arXiv:2502.17442·cs.SE·May 28, 2025

TL;DR

ThinkCoder introduces a two-phase code generation framework combining exploration and refinement, enhanced by preference-driven optimization, achieving high accuracy with reduced computational costs on benchmarks like HumanEval and MBPP.

Contribution

The paper presents a novel framework that integrates thorough exploration, optimal refinement, and preference-driven learning to improve code generation efficiency and accuracy.

Findings

1. Improves Pass@1 by 3.0% over MapCoder with 6.4% of the computation cost.

2. Achieves higher Pass@1 than AgentCoder after fewer rounds.

3. Enables LLaMA2-7B to perform competitively using only 20% of resources.
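
The findings are reported in terms of Pass@1. This summary does not say how the metric was computed, so the following is only a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval-style benchmarks, assuming n samples are drawn per problem and c of them pass every test. The function name and argument names are illustrative, not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of generated samples (assumed setup, not from the paper)
    c: number of those samples that pass all tests
    k: budget of samples the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples fail, any draw of k must include a passing one.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k = 1 the estimator reduces to the fraction c / n, so a benchmark-level Pass@1 is the mean of that fraction over all problems.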

Abstract

Code generation is crucial in software engineering for automating the coding process efficiently. While test-time computation methods show promise, they suffer from high latency due to multiple computation rounds. To overcome this, we introduce ThinkCoder, a framework that combines thorough exploration with optimal refinement. The exploration phase diversifies the solution space by searching for potential solutions, followed by a refinement phase that enhances precision. This approach allows us to select the best solution through careful consideration before taking action, avoiding excessive trial and error. To further minimize test-time computation overhead, we introduce preference-driven optimization with Reinforced Self-Training (ReST), which uses exploration trajectories from ThinkCoder to guide the LLM's evolution. This approach enhances the LLM's exploration efficiency via…
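
The two-phase idea in the abstract (explore several candidates, then refine the most promising one) can be sketched roughly as below. This is a toy illustration under stated assumptions, not the authors' implementation: candidates are plain Python callables rather than LLM outputs, scoring against test cases is assumed, and `propose`, `refine`, `score`, and `explore_then_refine` are hypothetical names.

```python
from typing import Callable, List, Tuple

# A test case is (arguments, expected result). Assumed format for this sketch.
TestCase = Tuple[tuple, object]


def score(candidate: Callable, tests: List[TestCase]) -> int:
    """Count how many test cases a candidate passes; crashes count as failures."""
    passed = 0
    for args, expected in tests:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass
    return passed


def explore_then_refine(
    propose: Callable[[int], List[Callable]],
    refine: Callable[[Callable], Callable],
    tests: List[TestCase],
    k: int = 3,
    max_rounds: int = 3,
) -> Callable:
    # Exploration: gather k diverse candidates and keep the best-scoring one.
    pool = propose(k)
    best = max(pool, key=lambda cand: score(cand, tests))
    # Refinement: improve the chosen candidate, stopping once all tests pass.
    for _ in range(max_rounds):
        if score(best, tests) == len(tests):
            break
        improved = refine(best)
        if score(improved, tests) > score(best, tests):
            best = improved
    return best


# Hypothetical stand-ins for an LLM: fixed candidates for "add two numbers".
def propose(k: int) -> List[Callable]:
    return [lambda a, b: a - b, lambda a, b: a * b, lambda a, b: a + b + 1][:k]


def refine(candidate: Callable) -> Callable:
    # A real system would revise the candidate using feedback; this stub
    # simply returns a corrected version.
    return lambda a, b: a + b


tests: List[TestCase] = [((1, 2), 3), ((0, 0), 0), ((2, 2), 4)]
best = explore_then_refine(propose, refine, tests)
```

Selecting a single candidate before refining keeps the number of refinement rounds small, which is the kind of saving the abstract attributes to "careful consideration before taking action".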

Taxonomy

Topics: Software Testing and Debugging Techniques · Model-Driven Software Engineering Techniques · Real-time simulation and control systems

Methods: Balanced Selection