A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Samuel Sokota; Ryan D'Orazio; J. Zico Kolter; Nicolas Loizou; Marc Lanctot; Ioannis Mitliagkas; Noam Brown; Christian Kroer

arXiv:2206.05825·cs.LG·April 12, 2023·6 cites

TL;DR

This paper introduces magnetic mirror descent, an algorithm inspired by mirror descent and the non-Euclidean proximal gradient algorithm, and shows that it works both as a quantal response equilibrium solver and as a reinforcement learning method for two-player zero-sum games.

Contribution

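The paper presents magnetic mirror descent, the first quantal response equilibrium solver to achieve linear convergence for extensive-form games with first-order feedback, and the first standard reinforcement learning algorithm to achieve empirically competitive results with CFR in tabular settings. It also performs favorably as a self-play deep reinforcement learning algorithm.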

Findings

01

Linear convergence to quantal response equilibria in extensive-form games with first-order feedback

02

Empirically competitive results with CFR in tabular settings

03

Favorable self-play deep RL performance in 3x3 Dark Hex and Phantom Tic-Tac-Toe

Abstract

This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is demonstrating the virtues of magnetic mirror descent as both an equilibrium solver and as an approach to reinforcement learning in two-player zero-sum games. These virtues include: 1) Being the first quantal response equilibria solver to achieve linear convergence for extensive-form games with first order feedback; 2) Being the first standard reinforcement learning algorithm to achieve empirically competitive results with CFR in tabular settings; 3) Achieving favorable performance in 3x3 Dark Hex and Phantom Tic-Tac-Toe as a self-play deep reinforcement learning algorithm.
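The abstract describes the algorithm only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes a particular form of the update that is not stated on this page: each player takes a proximal step that minimizes a linearized loss plus a KL penalty toward a fixed "magnet" policy plus a KL penalty toward the previous iterate, which has a closed form on the simplex. The matrix-game setting, the uniform magnet, the function names (`proximal_step`, `solve_matrix_game`, `logit_response`), and the values of `alpha`, `eta`, and `steps` are all hypothetical choices made for this example.

```python
import math


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def logit_response(payoffs, alpha):
    # Softmax of payoffs / alpha: the logit quantal response at temperature alpha.
    top = max(payoffs)
    return normalize([math.exp((p - top) / alpha) for p in payoffs])


def proximal_step(policy, magnet, loss_grad, eta, alpha):
    # Assumed update: closed-form minimizer over the simplex of
    #   eta * <loss_grad, z> + eta * alpha * KL(z || magnet) + KL(z || policy).
    power = 1.0 / (1.0 + eta * alpha)
    logits = [
        power * (math.log(p) + eta * alpha * math.log(m) - eta * g)
        for p, m, g in zip(policy, magnet, loss_grad)
    ]
    top = max(logits)  # subtract the max for numerical stability
    return normalize([math.exp(v - top) for v in logits])


def solve_matrix_game(payoff, alpha=1.0, eta=0.1, steps=2000):
    # payoff[i][j] is the row player's payoff; the column player receives its negative.
    rows, cols = len(payoff), len(payoff[0])
    x = normalize([float(i + 1) for i in range(rows)])  # non-uniform start
    y = normalize([float(j + 1) for j in range(cols)])
    magnet_x = [1.0 / rows] * rows  # uniform magnets (an assumption of this sketch)
    magnet_y = [1.0 / cols] * cols
    for _ in range(steps):
        # Row player maximizes x^T A y, so its loss gradient is -(A y).
        grad_x = [-sum(payoff[i][j] * y[j] for j in range(cols)) for i in range(rows)]
        # Column player minimizes x^T A y, so its loss gradient is A^T x.
        grad_y = [sum(payoff[i][j] * x[i] for i in range(rows)) for j in range(cols)]
        # Both players update simultaneously from the previous iterates.
        x, y = (
            proximal_step(x, magnet_x, grad_x, eta, alpha),
            proximal_step(y, magnet_y, grad_y, eta, alpha),
        )
    return x, y


if __name__ == "__main__":
    rock_paper_scissors = [[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]]
    x, y = solve_matrix_game(rock_paper_scissors)
    print("row policy:", x)
    print("column policy:", y)
```

With a uniform magnet, a fixed point of this update satisfies `x = logit_response(A y, alpha)` and `y = logit_response(-A^T x, alpha)`, which is the logit quantal response condition at temperature `alpha`. In rock-paper-scissors that fixed point is the uniform policy for both players, which makes the game a convenient sanity check.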

Code & Models

Videos

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games · slideslive

Taxonomy

Topics: Experimental Behavioral Economics Studies · Game Theory and Applications · Reinforcement Learning in Robotics

Methods: Entropy Regularization · Proximal Policy Optimization