ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Huaye Zeng; Dongfu Jiang; Haozhe Wang; Ping Nie; Xiaotong Chen; Wenhu Chen

arXiv:2502.01718 · cs.SE · May 27, 2025



TL;DR

This paper introduces ACECODER, a method that combines automated test-case synthesis with reinforcement learning to improve code generation models. It reports average gains of up to 10 points with best-of-32 sampling and over 25% on HumanEval-plus.

Contribution

We propose a pipeline for automated test-case synthesis and reward modeling that strengthens reinforcement learning for code models and improves results on multiple benchmarks.

Findings

01. 10-point improvement for Llama-3.1-8B-Ins

02. 5-point improvement for Qwen2.5-Coder-7B-Ins

03. Over 25% improvement on HumanEval-plus

Abstract

Most progress in recent coder models has been driven by supervised fine-tuning (SFT), while the potential of reinforcement learning (RL) remains largely unexplored, primarily because reliable reward data and reward models are lacking in the code domain. In this paper, we address this challenge by leveraging automated large-scale test-case synthesis to enhance code model training. Specifically, we design a pipeline that generates extensive (question, test-cases) pairs from existing code data. Using these test cases, we construct preference pairs based on pass rates over sampled programs and train reward models with the Bradley-Terry loss. With best-of-32 sampling, the reward model yields an average 10-point improvement for Llama-3.1-8B-Ins and a 5-point improvement for Qwen2.5-Coder-7B-Ins, putting the 7B model on par with the 236B DeepSeek-V2.5. Furthermore, we conduct reinforcement learning with both reward models and…
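The abstract's reward-modeling steps (pass rates over sampled programs, preference pairs, Bradley-Terry loss, best-of-N selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy absolute-value task, the `min_gap` threshold, and the scalar reward scores are all assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def pass_rate(program: Callable[[int], int], tests: Sequence[Tuple[int, int]]) -> float:
    """Fraction of (input, expected) test cases that the program passes."""
    passed = 0
    for arg, expected in tests:
        try:
            if program(arg) == expected:
                passed += 1
        except Exception:
            pass  # a crashing program simply fails that test
    return passed / len(tests)


def preference_pairs(rates: Sequence[float], min_gap: float = 0.4) -> List[Tuple[int, int]]:
    """Return (chosen, rejected) index pairs whose pass rates differ by at least min_gap.

    min_gap is a hypothetical threshold, not a value taken from the source.
    """
    pairs = []
    for i, j in combinations(range(len(rates)), 2):
        if abs(rates[i] - rates[j]) >= min_gap:
            pairs.append((i, j) if rates[i] > rates[j] else (j, i))
    return pairs


def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry loss -log(sigmoid(r_chosen - r_rejected)), computed stably."""
    d = r_chosen - r_rejected
    return max(-d, 0.0) + math.log1p(math.exp(-abs(d)))


def best_of_n(scores: Sequence[float]) -> int:
    """Index of the candidate with the highest reward-model score."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Toy task (assumed): implement absolute value. Three sampled "programs".
tests = [(3, 3), (-2, 2), (0, 0), (-5, 5)]
programs = [
    lambda x: abs(x),  # correct
    lambda x: x,       # correct only for non-negative inputs
    lambda x: x + 1,   # wrong on every test
]

rates = [pass_rate(p, tests) for p in programs]
pairs = preference_pairs(rates)

# Placeholder reward scores standing in for a trained reward model's outputs.
scores = [1.2, 0.1, -0.8]
mean_loss = sum(bradley_terry_loss(scores[c], scores[r]) for c, r in pairs) / len(pairs)
selected = best_of_n(scores)
```

In a real training setup the scores would come from a learned model and the loss would be minimized by gradient descent. Here the scores are fixed numbers, so the sketch only shows how the pairs and the loss are formed and how best-of-N picks the top-scoring sample.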


Taxonomy

Topics: Software Testing and Debugging Techniques · Model-Driven Software Engineering Techniques · Real-time simulation and control systems