ACE: Self-Evolving LLM Coding Framework via Adversarial Unit Test Generation and Preference Optimization

Yixu Huang; Xinglei Yu; Zhongyu Wei

arXiv:2605.16299 · cs.SE · May 22, 2026

TL;DR

ACE introduces a self-evolving LLM framework that uses adversarial test generation and execution-based supervision to improve code generation without relying on ground-truth data.

Contribution

The paper presents a novel solver-adversary architecture that enables self-improvement through active failure discovery and execution-centric supervision, without external ground truth or reward models.

Findings

01. Achieves absolute pass@1 gains of 3-7% over baselines.

02. Outperforms baselines on out-of-distribution benchmarks.

03. Maintains competitive inference efficiency.

Abstract

Large Language Models (LLMs) excel at code generation but remain heavily reliant on large-scale annotated solutions and verification-based supervision, which constrains scalability and hinders sustained self-improvement. Recent solver-verifier frameworks exploit program execution as an automatic supervision signal, but their effectiveness degrades as solvers become moderately strong: verifier-generated tests increasingly confirm semantic correctness rather than exposing the remaining failure modes. We propose ACE, a self-evolving code generation framework based on a solver-adversary architecture that prioritizes active failure discovery through execution-centric supervision. A single LLM alternates between generating candidate programs and producing adversarial unit test inputs optimized to induce execution-level failures, such as runtime errors, exceptions, or…
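
To make the execution-centric idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that candidate programs can be stood in for by plain Python callables, that an "execution-level failure" means a raised exception only, and that preference pairs are formed by ranking candidates on how many adversarial inputs crash them. The names `run_safely`, `failure_count`, and `build_preference_pairs` are hypothetical, and the pairing rule is an assumption, since the abstract is truncated before it describes how preferences are built.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_safely(program: Callable[[Any], Any], test_input: Any) -> bool:
    """Return True if the program runs on the input without raising."""
    try:
        program(test_input)
        return True
    except Exception:
        # Assumption: only raised exceptions count as execution-level failures.
        return False


def failure_count(program: Callable[[Any], Any], test_inputs: List[Any]) -> int:
    """Count how many adversarial inputs make the program raise."""
    return sum(not run_safely(program, x) for x in test_inputs)


def build_preference_pairs(
    candidates: Dict[str, Callable[[Any], Any]],
    test_inputs: List[Any],
) -> List[Tuple[str, str]]:
    """Return (chosen, rejected) name pairs: fewer failures is preferred.

    Hypothetical pairing rule; candidates with equal failure counts
    produce no pair.
    """
    scored = [(failure_count(p, test_inputs), name) for name, p in candidates.items()]
    pairs = []
    for fails_a, name_a in scored:
        for fails_b, name_b in scored:
            if fails_a < fails_b:
                pairs.append((name_a, name_b))
    return pairs


# Two toy "solver" candidates for computing a mean.
def mean_naive(xs):
    return sum(xs) / len(xs)  # raises ZeroDivisionError on an empty list


def mean_guarded(xs):
    return sum(xs) / len(xs) if xs else 0.0


# Toy "adversary" output: inputs chosen to provoke crashes.
adversarial_inputs = [[1, 2, 3], []]

pairs = build_preference_pairs(
    {"mean_naive": mean_naive, "mean_guarded": mean_guarded},
    adversarial_inputs,
)
print(pairs)
```

In this toy setup the empty list crashes `mean_naive` but not `mean_guarded`, so the single pair prefers `mean_guarded`. A real system would execute model-generated source in a sandbox with timeouts rather than calling trusted in-process functions.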
