Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning

Aozhe Wang; Yuchen Yan; Nan Zhou; Zhengxi Lu; Weiming Lu; Jun Xiao; Yueting Zhuang; Yongliang Shen

arXiv:2603.15611 · cs.CL · March 17, 2026

Open Access

TL;DR

Code-A1 introduces an adversarial co-evolution framework with separate code and test language models, enhancing code generation and test creation by avoiding self-collusion and enabling targeted, white-box testing.

Contribution

The paper proposes an adversarial co-evolution approach that uses separate models for code generation and test generation. Compared with existing single-model self-play methods, this separation avoids trivial, self-colluding tests and enables targeted adversarial testing.

Findings

01. Achieves code generation performance comparable to or better than that of models trained on human-written tests.

02. Significantly improves test generation capability.

03. Balances test validity with adversarial difficulty.

Abstract

Reinforcement learning for code generation relies on verifiable rewards from unit test pass rates. Yet high-quality test suites are scarce, existing datasets offer limited coverage, and static rewards fail to adapt as models improve. Recent self-play methods unify code and test generation in a single model, but face an inherent dilemma: white-box access leads to self-collusion, where the model produces trivial tests for easy rewards, yet black-box restriction yields generic tests that miss implementation-specific bugs. We introduce Code-A1, an adversarial co-evolution framework that jointly optimizes a Code LLM and a Test LLM with opposing objectives. The Code LLM is rewarded for passing more tests, while the Test LLM is rewarded for exposing more defects. This architectural separation eliminates self-collusion risks and safely enables white-box test generation, where the Test LLM can…

Taxonomy

Topics: Software Testing and Debugging Techniques · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms