A Survey on Self-play Methods in Reinforcement Learning

Ruize Zhang; Zelai Xu; Chengdong Ma; Chao Yu; Wei-Wei Tu; Wenhao Tang; Shiyu Huang; Deheng Ye; Wenbo Ding; Yaodong Yang; Yu Wang

arXiv:2408.01072 · cs.AI · October 21, 2025

TL;DR

This survey provides a comprehensive overview of self-play methods in reinforcement learning, categorizing algorithms, discussing their applications in multi-agent tasks, and outlining future research challenges.

Contribution

It offers a unified framework for understanding self-play algorithms and bridges the gap between theory and practical applications in multi-agent reinforcement learning.

Findings

01. Classifies existing self-play algorithms within a unified framework

02. Highlights the role of self-play in complex multi-agent tasks like Go and poker

03. Identifies open challenges and future directions in self-play research

Abstract

Self-play, a learning paradigm in which agents iteratively refine their policies by interacting with historical or concurrent versions of themselves or with other evolving agents, has shown remarkable success in solving complex non-cooperative multi-agent tasks. Despite its growing prominence in multi-agent reinforcement learning (MARL) applications such as Go, poker, and video games, a comprehensive and structured understanding of self-play remains lacking. This survey fills this gap by offering a comprehensive roadmap to the diverse landscape of self-play methods. It begins by introducing the necessary preliminaries, including the MARL framework and basic game theory concepts. It then provides a unified framework and classifies existing self-play algorithms within it. Moreover, the paper bridges the gap between the algorithms and their practical implications by illustrating the role of…
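
As a rough illustration of the idea of an agent improving by playing against historical versions of itself, the sketch below runs fictitious play on rock-paper-scissors. This is not an algorithm taken from the survey: the choice of game, the payoff matrix, the starting history, and the names `PAYOFF`, `best_response`, and `fictitious_self_play` are all assumptions made for this example. At each step the agent best-responds to the empirical mix of its own past actions, and that mix should drift toward the uniform strategy.

```python
# Payoff to the row player in rock-paper-scissors
# (0 = rock, 1 = paper, 2 = scissors). Illustrative assumption, not from the survey.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response(counts):
    """Return the action with the highest total payoff against past play.

    `counts[b]` is how often action b was played so far. Ties go to the
    lowest action index.
    """
    values = [sum(PAYOFF[a][b] * counts[b] for b in range(3)) for a in range(3)]
    return max(range(3), key=lambda a: values[a])


def fictitious_self_play(steps=10_000):
    """Repeatedly best-respond to the empirical mix of one's own history."""
    counts = [1, 0, 0]  # arbitrary starting history: a single round of rock
    for _ in range(steps):
        counts[best_response(counts)] += 1
    total = sum(counts)
    return [c / total for c in counts]


frequencies = fictitious_self_play()
print(frequencies)  # empirical action frequencies after self-play
```

The opponent here is simply the running average of the agent's own earlier choices, which is one of the simplest ways to realize "historical versions of itself"; methods that train neural policies against saved checkpoints follow the same loop with a richer policy class.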

Taxonomy

Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics