Emergent Reciprocity and Team Formation from Randomized Uncertain Social Preferences
Bowen Baker

TL;DR
This paper introduces randomized uncertain social preferences (RUSP), an environment augmentation for multi-agent reinforcement learning (MARL). Agents trained with RUSP develop social behaviors such as reciprocity and team formation, and they cooperate more effectively in social dilemmas.
Contribution
The paper presents RUSP, a scalable environment augmentation that fosters emergent social behaviors in MARL without altering original game dynamics.
Findings
Emergent reciprocity and reputation behaviors observed
Higher social welfare equilibria achieved in social dilemmas
Applicable to various multi-agent environments
Abstract
Multi-agent reinforcement learning (MARL) has shown recent success in increasingly complex fixed-team zero-sum environments. However, the real world is not zero-sum nor does it have fixed teams; humans face numerous social dilemmas and must learn when to cooperate and when to compete. To successfully deploy agents into the human world, it may be important that they be able to understand and help in our conflicts. Unfortunately, selfish MARL agents typically fail when faced with social dilemmas. In this work, we show evidence of emergent direct reciprocity, indirect reciprocity and reputation, and team formation when training agents with randomized uncertain social preferences (RUSP), a novel environment augmentation that expands the distribution of environments agents play in. RUSP is generic and scalable; it can be applied to any multi-agent environment without changing the original…
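The abstract names the idea but, as truncated here, does not spell out the mechanics. The following is a minimal sketch of one plausible reading of "randomized uncertain social preferences", not the paper's implementation: each episode draws a random matrix of social preferences, each agent's training reward becomes a weighted mix of all agents' game rewards, and each agent sees only a noisy copy of the matrix. The function names, the uniform sampling scheme, and the Gaussian noise model are all assumptions made for illustration.

```python
import random


def sample_preferences(n_agents, rng):
    """Draw a random n x n preference matrix whose rows sum to 1.

    Assumption: uniform weights, normalized per row. The source does not
    state the sampling distribution.
    """
    rows = []
    for _ in range(n_agents):
        weights = [rng.random() + 1e-9 for _ in range(n_agents)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def shaped_rewards(prefs, rewards):
    """Mix game rewards: agent i receives sum_j prefs[i][j] * rewards[j]."""
    return [sum(w * r for w, r in zip(row, rewards)) for row in prefs]


def noisy_view(prefs, noise_std, rng):
    """Return one agent's private estimate of prefs.

    Assumption: independent Gaussian noise on every entry models the
    "uncertain" part of the name.
    """
    return [[w + rng.gauss(0.0, noise_std) for w in row] for row in prefs]


if __name__ == "__main__":
    rng = random.Random(0)
    prefs = sample_preferences(3, rng)        # resampled every episode
    game_rewards = [1.0, 0.0, -1.0]           # rewards from the unmodified game
    print(shaped_rewards(prefs, game_rewards))
    print(noisy_view(prefs, noise_std=0.1, rng=rng))
```

In this sketch the underlying game is untouched: only the reward each agent trains on and one extra observation change, which is consistent with the abstract's description of RUSP as an augmentation rather than a new environment.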
Taxonomy
Topics: Evolutionary Game Theory and Cooperation · Experimental Behavioral Economics Studies · Game Theory and Applications
