Status-quo policy gradient in Multi-Agent Reinforcement Learning
Pinkesh Badjatiya, Mausoom Sarkar, Nikaash Puri, Jayakumar Subramanian, Abhishek Sinha, Siddharth Singh, Balaji Krishnamurthy

TL;DR
This paper introduces a loss function based on the status-quo bias for reinforcement learning agents. Agents trained with it learn high-utility strategies in social dilemmas and complex multi-agent environments, and outperform existing methods.
Contribution
The paper proposes a novel status-quo loss (SQLoss) and policy gradient algorithm that incorporate human-like bias to improve multi-agent RL performance in social dilemmas.
Findings
SQLoss enables high-utility policies in social dilemma matrix games.
SQLoss outperforms state-of-the-art methods in non-matrix games with visual input.
SQLoss promotes cooperative behavior in multi-agent settings like Braess' paradox.
Abstract
Individual rationality, which involves maximizing expected individual returns, does not always lead to high-utility individual or group outcomes in multi-agent problems. For instance, in multi-agent social dilemmas, Reinforcement Learning (RL) agents trained to maximize individual rewards converge to a low-utility mutually harmful equilibrium. In contrast, humans evolve useful strategies in such social dilemmas. Inspired by ideas from human psychology that attribute this behavior to the status-quo bias, we present a status-quo loss (SQLoss) and the corresponding policy gradient algorithm that incorporates this bias in an RL agent. We demonstrate that agents trained with SQLoss learn high-utility policies in several social dilemma matrix games (Prisoner's Dilemma, Stag Hunt matrix variant, Chicken Game). We show how SQLoss outperforms existing state-of-the-art methods to obtain…
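The gap between individual rationality and high-utility outcomes described above can be illustrated with a one-shot Prisoner's Dilemma. The sketch below is only an illustration of the dilemma itself, not of SQLoss or the paper's algorithm. The payoff values and the names `PAYOFFS`, `best_response`, and `is_nash` are assumptions chosen for the example (a common textbook parameterization), not taken from the paper.

```python
# Illustrative Prisoner's Dilemma (assumed textbook payoffs, not from the paper).
# Actions: 0 = cooperate, 1 = defect.
# PAYOFFS[(a1, a2)] = (reward to agent 1, reward to agent 2).
PAYOFFS = {
    (0, 0): (-1, -1),  # both cooperate
    (0, 1): (-3, 0),   # agent 1 cooperates, agent 2 defects
    (1, 0): (0, -3),   # agent 1 defects, agent 2 cooperates
    (1, 1): (-2, -2),  # both defect
}
ACTIONS = (0, 1)


def best_response(opponent_action, player):
    """Action that maximizes the player's own reward against a fixed opponent action."""
    def reward(action):
        joint = (action, opponent_action) if player == 0 else (opponent_action, action)
        return PAYOFFS[joint][player]
    return max(ACTIONS, key=reward)


def is_nash(joint):
    """A joint action is a Nash equilibrium if each action is a best response to the other."""
    a1, a2 = joint
    return best_response(a2, 0) == a1 and best_response(a1, 1) == a2


# Joint actions from which no individually rational agent wants to deviate.
nash_equilibria = [joint for joint in PAYOFFS if is_nash(joint)]

# Sum of both agents' rewards for each joint action.
group_return = {joint: sum(rewards) for joint, rewards in PAYOFFS.items()}

print("Nash equilibria:", nash_equilibria)
print("Group return (cooperate, cooperate):", group_return[(0, 0)])
print("Group return (defect, defect):", group_return[(1, 1)])
```

Under these assumed payoffs, defecting is each agent's best response to either opponent action, so mutual defection is the only equilibrium even though mutual cooperation gives both agents a higher reward. This is the kind of low-utility outcome the abstract attributes to agents that maximize individual rewards.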
Taxonomy
Topics: Experimental Behavioral Economics Studies · Evolutionary Game Theory and Cooperation · Reinforcement Learning in Robotics
