Self-Play Only Evolves When Self-Synthetic Pipeline Ensures Learnable Information Gain

Wei Liu; Siya Qi; Yali Du; Yulan He

arXiv:2603.02218·cs.LG·May 19, 2026

TL;DR

This paper argues that sustainable self-evolution of large language models requires a self-synthesised data pipeline whose learnable information increases across iterations, and frames that pipeline around three roles: Proposer, Solver, and Verifier.

Contribution

It introduces a triadic-role framework (Proposer, Solver, Verifier) and three system designs that jointly target learnable information gain, with the aim of sustaining self-evolution of LLMs rather than plateauing.

Findings

01. Asymmetric co-evolution enhances learnable information across roles.

02. Capacity growth aligns model resources with increasing information.

03. Proactive information seeking prevents saturation and supports continuous improvement.

Abstract

Large language models (LLMs) make it plausible to build systems that improve through self-evolving loops, but many existing proposals are better understood as self-play and often plateau quickly. A central failure mode is that the loop synthesises more data without increasing learnable information for the next iteration. Through experiments on a self-play coding task, we reveal that sustainable self-evolution requires a self-synthesised data pipeline with learnable information that increases across iterations. We identify triadic roles that self-evolving LLMs play: the Proposer, which generates tasks; the Solver, which attempts solutions; and the Verifier, which provides training signals, and we identify three system designs that jointly target learnable information gain from this triadic roles perspective. Asymmetric co-evolution closes a weak-to-strong-to-weak loop across roles.…
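
The plateau described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's method or metric: `pass_prob`, `learnable_signal`, `run_loop`, the logistic pass model, and the use of mean Bernoulli variance p(1 - p) as a stand-in for "learnable information" are all hypothetical choices made for illustration. A fixed Proposer keeps issuing the same tasks, so the Verifier's signal shrinks as the Solver improves; a Proposer that tracks the Solver's skill keeps the signal from collapsing.

```python
import math


def pass_prob(skill: float, difficulty: float) -> float:
    """Assumed logistic model: chance the Solver passes the Verifier's check."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))


def learnable_signal(skill: float, tasks: list[float]) -> float:
    """Toy proxy for learnable information: mean Bernoulli variance p(1 - p).

    It is near zero when tasks are trivially easy or impossibly hard, and
    largest when the Solver passes about half the time.
    """
    probs = [pass_prob(skill, d) for d in tasks]
    return sum(p * (1.0 - p) for p in probs) / len(probs)


def run_loop(iterations: int, adaptive_proposer: bool, lr: float = 4.0) -> list[float]:
    """Run a toy Proposer/Solver/Verifier loop and record the signal per iteration."""
    skill = 0.0
    history = []
    for _ in range(iterations):
        # Proposer: either fixed tasks, or tasks centred on the Solver's current skill.
        centre = skill if adaptive_proposer else 0.0
        tasks = [centre + offset for offset in (-1.0, 0.0, 1.0)]
        # Solver + Verifier: summarised here by the expected signal on these tasks.
        signal = learnable_signal(skill, tasks)
        history.append(signal)
        # Assumed update rule: the Solver improves in proportion to the signal it receives.
        skill += lr * signal
    return history


static_history = run_loop(10, adaptive_proposer=False)
adaptive_history = run_loop(10, adaptive_proposer=True)

if __name__ == "__main__":
    for i, (s, a) in enumerate(zip(static_history, adaptive_history)):
        print(f"iter {i}: static={s:.4f} adaptive={a:.4f}")
```

In this sketch the static loop keeps producing tasks but each one carries less signal, which mirrors the failure mode the abstract names: more data without more learnable information. The adaptive loop is only a caricature of co-evolving roles; the paper's actual designs are described in the full text.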

Taxonomy

Topics: Language and cultural evolution · Machine Learning in Materials Science · Topic Modeling