Learning Generalizable Risk-Sensitive Policies to Coordinate in Decentralized Multi-Agent General-Sum Games

Ziyi Liu; Xian Guo; Yongchun Fang

arXiv:2205.15859 · cs.MA · January 5, 2023

TL;DR

This paper introduces GRSP, a multi-agent reinforcement learning method that lets self-interested agents learn risk-sensitive, generalizable coordination strategies in decentralized general-sum games and avoid being exploited by non-cooperative opponents during execution.

Contribution

According to the authors, GRSP is the first approach to learn coordination strategies in both the iterated prisoner's dilemma (IPD) and the iterated stag hunt (ISH) without shaping opponents or rewards. It also addresses generalization during execution and scales to high-dimensional environments.

Findings

01

Agents trained with GRSP achieve stable mutual coordination.

02

GRSP prevents exploitation by non-cooperative opponents.

03

The method scales to high-dimensional settings.

Abstract

While various multi-agent reinforcement learning methods have been proposed for cooperative settings, few works investigate how self-interested learning agents achieve mutual coordination in decentralized general-sum games and generalize pre-trained policies to non-cooperative opponents during execution. In this paper, we present Generalizable Risk-Sensitive Policy (GRSP). GRSP learns the distribution over each agent's return and estimates a dynamic risk-seeking bonus to discover risky coordination strategies. Furthermore, to avoid overfitting to training opponents, GRSP learns an auxiliary opponent modeling task to infer opponents' types and dynamically alter its strategies accordingly during execution. Empirically, agents trained via GRSP stably achieve mutual coordination during training and avoid being exploited by non-cooperative opponents during execution. To the best of our…
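The idea of adding a risk-seeking bonus to a learned return distribution can be illustrated with a small sketch. This is not the authors' implementation: the function names, the upper-tail form of the bonus, the `bonus_weight` and `tail_fraction` parameters, and the stag-hunt return samples are all assumptions made for illustration. Each action's return distribution is represented by a handful of sampled returns, and the bonus is taken to be the gap between the mean of the best outcomes and the overall mean.

```python
from statistics import mean


def upper_tail_mean(returns, tail_fraction=0.25):
    """Mean of the highest `tail_fraction` of the sampled returns."""
    ordered = sorted(returns)
    k = max(1, round(len(ordered) * tail_fraction))
    return mean(ordered[-k:])


def risk_seeking_value(returns, bonus_weight, tail_fraction=0.25):
    """Expected return plus a weighted optimistic bonus (assumed form)."""
    expected = mean(returns)
    bonus = upper_tail_mean(returns, tail_fraction) - expected
    return expected + bonus_weight * bonus


def select_action(return_samples, bonus_weight):
    """Pick the action with the highest risk-adjusted value."""
    return max(
        return_samples,
        key=lambda action: risk_seeking_value(return_samples[action], bonus_weight),
    )


# Hypothetical return samples for a stag hunt: hunting stag pays well
# only if the other agent cooperates, while hunting hare is safer.
return_samples = {
    "stag": [-3.0, -3.0, -3.0, 4.0, 4.0, 4.0, 4.0, 4.0],
    "hare": [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0],
}

risk_neutral_choice = select_action(return_samples, bonus_weight=0.0)
risk_seeking_choice = select_action(return_samples, bonus_weight=1.0)
print(risk_neutral_choice, risk_seeking_choice)
```

With these made-up numbers, a risk-neutral agent prefers the safe action, while a sufficiently large bonus weight tips the choice toward the risky coordination action. A "dynamic" bonus could be modeled by changing `bonus_weight` over the course of training, though the schedule used in the paper is not stated in this summary.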

Taxonomy

Topics: Crime, Illicit Activities, and Governance · Experimental Behavioral Economics Studies · Evolutionary Game Theory and Cooperation