Solving Games with Functional Regret Estimation

Kevin Waugh; Dustin Morrill; J. Andrew Bagnell; Michael Bowling

arXiv:1411.7974·cs.AI·January 5, 2015

TL;DR

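This paper introduces an online learning method that uses a function approximator to estimate regrets in large extensive-form games. A no-regret algorithm acting on these estimates converges to a Nash equilibrium in self-play, provided the regrets are ultimately realizable by the approximator.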

Contribution

The paper presents a regret estimation approach in which both the abstraction and the equilibrium are learned during self-play, generalizing existing work on abstraction in large games.

Findings

01 Achieves higher-quality strategies than state-of-the-art abstraction techniques.

02 Guarantees convergence to a Nash equilibrium in self-play, so long as the regrets are ultimately realizable by the function approximator.

03 Provides a theoretical bound relating the quality of the function approximation to the regret of the algorithm.

Abstract

We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation and the regret of the algorithm. A corollary is that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work on abstraction in large games; in our work, both the abstraction and the equilibrium are learned during self-play. We demonstrate empirically that the method achieves higher quality strategies than…
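To make the idea in the abstract concrete, here is a minimal sketch of a no-regret learner that acts on estimated regrets rather than stored ones. It is not the authors' implementation. It assumes a small two-player zero-sum matrix game instead of an extensive-form game, uses regret matching as the no-regret algorithm, and uses a least-squares linear model as the function approximator. The names `regret_matching_policy`, `LinearRegretEstimator`, `self_play`, and `exploitability`, the payoff matrix, and the feature matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np


def regret_matching_policy(estimated_regrets):
    """Play each action in proportion to its positive estimated regret."""
    positive = np.maximum(estimated_regrets, 0.0)
    total = positive.sum()
    if total <= 0.0:
        # No action has positive estimated regret: fall back to uniform play.
        n = len(estimated_regrets)
        return np.full(n, 1.0 / n)
    return positive / total


class LinearRegretEstimator:
    """Hypothetical approximator: predicted regret = features @ weights."""

    def __init__(self, features):
        self.features = features                    # one feature row per action
        self.regret_sum = np.zeros(features.shape[0])
        self.count = 0
        self.weights = np.zeros(features.shape[1])

    def update(self, instantaneous_regrets):
        # Refit by least squares to the average regret observed so far.
        self.regret_sum += instantaneous_regrets
        self.count += 1
        targets = self.regret_sum / self.count
        self.weights, *_ = np.linalg.lstsq(self.features, targets, rcond=None)

    def predict(self):
        return self.features @ self.weights


def self_play(payoff, features, iterations=5000):
    """Self-play in a square zero-sum matrix game (assumption of this sketch).

    `payoff[i, j]` is the row player's utility; the column player receives
    its negation. Returns both players' average strategies.
    """
    n = payoff.shape[0]
    row_est = LinearRegretEstimator(features)
    col_est = LinearRegretEstimator(features)
    x_sum, y_sum = np.zeros(n), np.zeros(n)
    for _ in range(iterations):
        # Policies are defined from estimated regrets, not true ones.
        x = regret_matching_policy(row_est.predict())
        y = regret_matching_policy(col_est.predict())
        x_sum += x
        y_sum += y
        value = x @ payoff @ y
        # Instantaneous regret: utility of each action minus expected utility.
        row_est.update(payoff @ y - value)
        col_est.update(-(x @ payoff) + value)
    return x_sum / iterations, y_sum / iterations


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    return (payoff @ y).max() - (x @ payoff).min()


if __name__ == "__main__":
    payoff = np.array([[2.0, -1.0], [-1.0, 1.0]])   # illustrative game
    features = np.eye(2)                            # one-hot: regrets realizable
    x_avg, y_avg = self_play(payoff, features)
    print(x_avg, y_avg, exploitability(payoff, x_avg, y_avg))
```

With one-hot features every regret vector is exactly representable, so the sketch reduces to ordinary regret matching and the average strategies approach an equilibrium. Replacing `features` with a coarser matrix (for example, one that gives two actions identical rows) forces those actions to share a regret estimate, which is one way to read the abstract's remark that the method generalizes abstraction.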
