Multi-agent Reinforcement Learning in Sequential Social Dilemmas
Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel

TL;DR
This paper introduces sequential social dilemmas modeled as Markov games, where agents learn policies over time, revealing how environmental factors influence cooperation and conflict in multi-agent settings.
Contribution
It extends social dilemma research to sequential settings with policy-based decisions, analyzing dynamics with deep reinforcement learning in new Markov game environments.
Findings
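Conflict can emerge from competition over shared resources.
Learned behavior changes with environmental factors, including resource abundance.
The sequential nature of the dilemma, where cooperation is a property of policies rather than single actions, affects how cooperation arises.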
Abstract
Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the choice to cooperate or defect as an atomic action. In real-world social dilemmas these choices are temporally extended. Cooperativeness is a property that applies to policies, not elementary actions. We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions. We analyze the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games we introduce here: (1) a fruit Gathering game and (2) a Wolfpack hunting game. We characterize how learned behavior in each domain changes as a function of environmental factors including resource abundance. Our…
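The "mixed incentive structure of matrix game social dilemmas" mentioned above can be made concrete with a small check. The sketch below assumes the commonly used formalization in terms of four payoffs (R for mutual cooperation, P for mutual defection, S for cooperating against a defector, T for defecting against a cooperator); the abstract itself does not spell out these conditions, and the names `Payoffs` and `is_social_dilemma` as well as the payoff values are illustrative, not taken from the paper.

```python
from typing import NamedTuple


class Payoffs(NamedTuple):
    """Payoffs of a symmetric two-player, two-action matrix game (illustrative)."""
    R: float  # reward: both players cooperate
    P: float  # punishment: both players defect
    S: float  # sucker: cooperate while the other defects
    T: float  # temptation: defect while the other cooperates


def is_social_dilemma(p: Payoffs) -> bool:
    """Check the assumed social dilemma inequalities for a matrix game."""
    # Mutual cooperation beats mutual defection, beats being exploited,
    # and beats taking turns exploiting each other.
    cooperation_preferred = p.R > p.P and p.R > p.S and 2 * p.R > p.T + p.S
    greed = p.T > p.R  # exploiting a cooperator pays more than cooperating
    fear = p.P > p.S   # defecting pays more than being exploited
    # At least one incentive to defect must be present.
    return cooperation_preferred and (greed or fear)


# Illustrative payoff values, not taken from the paper.
prisoners_dilemma = Payoffs(R=3, P=1, S=0, T=5)
no_dilemma = Payoffs(R=4, P=1, S=2, T=3)  # neither greed nor fear applies

print(is_social_dilemma(prisoners_dilemma))
print(is_social_dilemma(no_dilemma))
```

In a sequential social dilemma the same kind of payoffs would have to be estimated from the returns of whole policies rather than read off a table, which is the gap the paper addresses.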
Taxonomy
Topics: Evolutionary Game Theory and Cooperation · Experimental Behavioral Economics Studies · Mathematical and Theoretical Epidemiology and Ecology Models
