Evolution of Societies via Reinforcement Learning
Yann Bouteiller, Karthik Soma, Giovanni Beltrame

TL;DR
This paper introduces a scalable simulation framework for evolving large, heterogeneous populations of reinforcement learning agents, revealing how advanced learning strategies influence social evolution in classic games.
Contribution
It develops a fast, parallel implementation of Policy Gradient and LOLA algorithms for large-scale evolutionary simulations of MARL agents.
Findings
Populations of 200,000 PG or LOLA agents were simulated successfully.
Opponent-Learning Awareness significantly impacts social evolution.
Different learning strategies lead to distinct evolutionary outcomes.
Abstract
The universe involves many independent co-learning agents as an ever-evolving part of our observed environment. Yet, in practice, Multi-Agent Reinforcement Learning (MARL) applications are typically constrained to small, homogeneous populations and remain computationally intensive. We propose a methodology that enables simulating populations of Reinforcement Learning agents at evolutionary scale. More specifically, we derive a fast, parallelizable implementation of Policy Gradient (PG) and Opponent-Learning Awareness (LOLA), tailored for evolutionary simulations where agents undergo random pairwise interactions in stateless normal-form games. We demonstrate our approach by simulating the evolution of very large populations made of heterogeneous co-learning agents, under both naive and advanced learning strategies. In our experiments, 200,000 PG or LOLA agents evolve in the classic games…
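To make the setting in the abstract concrete, the following is a minimal sketch of naive Policy Gradient learners meeting through random pairwise interactions in a stateless normal-form game. It is not the authors' implementation: the Prisoner's Dilemma payoff values, the one-logit-per-agent policy, the learning rate, and every function name are assumptions chosen for illustration. The sketch uses the exact gradient of the expected payoff rather than sampled returns, and it leaves out both LOLA and any evolutionary selection step.

```python
import math
import random

# Hypothetical Prisoner's Dilemma payoffs for the row player (an assumption,
# not taken from the paper): PAYOFF[my_action][their_action],
# where action 0 = cooperate and action 1 = defect.
PAYOFF = [[3.0, 0.0], [5.0, 1.0]]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pg_gradient(theta_self, theta_other, payoff=PAYOFF):
    """Exact gradient of an agent's expected payoff w.r.t. its own logit."""
    p = sigmoid(theta_self)   # own probability of cooperating
    q = sigmoid(theta_other)  # partner's probability of cooperating
    # Derivative of the expected payoff w.r.t. p, holding the partner fixed.
    dv_dp = (q * (payoff[0][0] - payoff[1][0])
             + (1.0 - q) * (payoff[0][1] - payoff[1][1]))
    # Chain rule through the sigmoid parameterization.
    return dv_dp * p * (1.0 - p)


def step(thetas, rng, lr=0.1):
    """Pair agents uniformly at random; each takes one naive PG ascent step."""
    order = list(range(len(thetas)))
    rng.shuffle(order)
    new = list(thetas)
    # With an odd population size, the last agent in the shuffle sits out.
    for k in range(0, len(order) - 1, 2):
        i, j = order[k], order[k + 1]
        # Both updates use the pre-step parameters (simultaneous learning).
        new[i] = thetas[i] + lr * pg_gradient(thetas[i], thetas[j])
        new[j] = thetas[j] + lr * pg_gradient(thetas[j], thetas[i])
    return new


def mean_cooperation(thetas):
    return sum(sigmoid(t) for t in thetas) / len(thetas)


def simulate(n_agents=1000, n_steps=200, seed=0):
    """Return the mean cooperation probability before and after learning."""
    rng = random.Random(seed)
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_agents)]
    start = mean_cooperation(thetas)
    for _ in range(n_steps):
        thetas = step(thetas, rng)
    return start, mean_cooperation(thetas)
```

Calling `simulate()` returns the population's mean cooperation probability at the start and at the end of the run. Under these assumed payoffs, defection strictly dominates, so the gradient on the cooperation logit is always negative and naive learners drift toward defection. The per-pair loop is written for readability; a large-scale version would presumably vectorize the pairing and the update over the whole population.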
Taxonomy
Topics: Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research · Computability, Logic, AI Algorithms
