On Gradient-Based Learning in Continuous Games

Eric Mazumdar; Lillian J. Ratliff; S. Shankar Sastry

arXiv:1804.05464 · cs.LG · February 21, 2020

TL;DR

This paper develops a unified dynamical-systems framework for analyzing gradient-based learning in multi-agent games. It shows that gradient-based learning avoids some local Nash equilibria and can converge to non-Nash strategies, which may explain convergence difficulties in applications such as GAN training.

Contribution

The paper introduces a general dynamical systems framework for competitive gradient-based learning, characterizes a subset of local Nash equilibria that such learning avoids, and explains how gradient-based algorithms can converge to strategies that are not Nash equilibria.

Findings

01

Certain local Nash equilibria are avoided by gradient-based learning.

02

Gradient algorithms can converge to non-Nash strategies in some games.

03

Empirical results show that policy gradient often avoids global Nash equilibria in quadratic games.
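
To make the first finding concrete, here is a minimal sketch of a local Nash equilibrium that simultaneous gradient play avoids. The two-player quadratic game, the names `omega`, `J`, and `gradient_play`, and the step size are all illustrative assumptions, not taken from the paper. At the origin each player's cost is strictly convex in that player's own variable, so the origin is a strict local Nash equilibrium, yet the Jacobian of the joint gradient dynamics has a negative eigenvalue, so the origin is a saddle and almost every initial condition moves away from it.

```python
import numpy as np

# Hypothetical two-player quadratic game (an assumption for illustration):
#   f1(x, y) = 0.5 * x**2 + 2 * x * y   (player 1 chooses x, minimizes f1)
#   f2(x, y) = 0.5 * y**2 + 2 * x * y   (player 2 chooses y, minimizes f2)

def omega(z):
    """Stack each player's gradient with respect to its own variable."""
    x, y = z
    return np.array([x + 2.0 * y, y + 2.0 * x])

# Jacobian of omega; constant because the game is quadratic.
J = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# Each player's own second derivative is positive, so (0, 0) is a
# strict local Nash equilibrium.
is_strict_local_nash = bool(J[0, 0] > 0 and J[1, 1] > 0)

# J is symmetric here, so eigvalsh applies; its eigenvalues are -1 and 3.
eigs = np.linalg.eigvalsh(J)
is_saddle = bool(eigs.min() < 0 < eigs.max())

def gradient_play(z0, step=0.05, iters=200):
    """Simultaneous gradient descent: every player steps at once."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        z = z - step * omega(z)
    return z

# Start very close to the equilibrium and watch the iterates leave.
z_final = gradient_play([1e-3, -2e-3])
print(is_strict_local_nash, is_saddle, np.linalg.norm(z_final))
```

Only initial conditions on the line x = y, a set of measure zero, converge to the origin in this sketch.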

Abstract

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory. For both general-sum and potential games, we characterize a non-negligible subset of the local Nash equilibria that will be avoided if each agent employs a gradient-based learning algorithm. We also shed light on the issue of convergence to non-Nash strategies in general- and zero-sum games, which may have no relevance to the underlying game, and arise solely due to the choice of algorithm. The existence and frequency of such strategies may explain some of the difficulties encountered when using gradient descent in zero-sum games as, e.g., in the training of generative adversarial networks. To reinforce the theoretical contributions,…
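
As an illustration of convergence to non-Nash strategies in zero-sum games, the following sketch uses a hypothetical quadratic zero-sum game; the coefficients `a`, `b`, `d`, the function names, and the step size are assumptions chosen for the example, not values from the paper. Player 1 minimizes f and player 2 minimizes -f. The origin is a local maximum of player 1's cost in player 1's own variable, so it is not a local Nash equilibrium, but both eigenvalues of the Jacobian of the joint gradient dynamics have positive real part, so simultaneous gradient play converges to it.

```python
import numpy as np

# Hypothetical zero-sum game (an assumption for illustration):
#   f(x, y) = (a/2) x^2 + b x y + (d/2) y^2
# Player 1 chooses x to minimize f; player 2 chooses y to minimize -f.
a, b, d = -1.0, 3.0, -4.0

def omega(z):
    """Stack D_x f (player 1) and D_y(-f) (player 2)."""
    x, y = z
    return np.array([a * x + b * y, -(b * x + d * y)])

# Jacobian of omega; constant because the game is quadratic.
J = np.array([[a, b],
              [-b, -d]])

# Eigenvalues are 1.5 +/- 1.66i: positive real parts mean the origin
# is a locally attracting point of the dynamics z' = -omega(z).
is_attractor = bool(np.all(np.linalg.eigvals(J).real > 0))

# Player 1's own curvature is a < 0, which violates the second-order
# necessary condition for a local Nash equilibrium.
is_local_nash = bool(a >= 0 and -d >= 0)

def simultaneous_gradient_play(z0, step=0.1, iters=300):
    """Both players take a gradient step on their own cost at once."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        z = z - step * omega(z)
    return z

z_final = simultaneous_gradient_play([1.0, -0.5])
print(is_attractor, is_local_nash, np.linalg.norm(z_final))
```

The attracting point in this sketch exists only because of the update rule: player 1 could lower its cost by moving away from x = 0, but the coupling with player 2's updates pulls the joint iterate back.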
