Function Approximation for Solving Stackelberg Equilibrium in Large Perfect Information Games
Chun Kai Ling, J. Zico Kolter, Fei Fang

TL;DR
This paper introduces a neural network-based function approximation method for large general-sum extensive-form games. It learns the Enforceable Payoff Frontier (EPF) and uses it to approximate the optimal Stackelberg extensive-form correlated equilibrium, with performance bounds that depend on the approximation error.
Contribution
It proposes the first function approximation approach for Stackelberg extensive-form games, using the EPF as a generalization of the state value function to general-sum settings.
Findings
Successfully scales to larger games than previous methods
Guarantees incentive compatibility in the learned equilibria
Provides performance bounds that depend on the function approximation error
Abstract
Function approximation (FA) has been a critical component in solving large zero-sum games. Yet little attention has been paid to FA for solving general-sum extensive-form games, even though they are widely regarded as computationally more challenging than their fully competitive or cooperative counterparts. A key challenge is that for many equilibria in general-sum games, no simple analogue to the state value function used in Markov Decision Processes and zero-sum games exists. In this paper, we propose learning the Enforceable Payoff Frontier (EPF) -- a generalization of the state value function for general-sum games. We approximate the optimal Stackelberg extensive-form correlated equilibrium by representing EPFs with neural networks and training them using appropriate backup operations and loss functions. This is the first method that applies…
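To make the idea of a payoff frontier concrete, here is a minimal sketch rather than the authors' method. It assumes a frontier can be stored as the knots of a concave piecewise-linear function mapping a follower payoff to the best leader payoff attainable at that follower payoff, and that a backup at a leader-controlled node takes the upper concave envelope of the children's frontiers. The abstract does not specify its backup operations, so both assumptions are illustrative, as are the names `upper_concave_envelope`, `leader_backup`, and `evaluate`. The neural network representation and the follower-node backup are not modeled.

```python
from typing import Dict, List, Tuple

# A point is (follower payoff, leader payoff). Hypothetical representation.
Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a right turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_concave_envelope(points: List[Point]) -> List[Point]:
    """Return the knots of the upper concave envelope, sorted by follower payoff."""
    # Keep only the best leader payoff for each follower payoff.
    best: Dict[float, float] = {}
    for x, y in points:
        if x not in best or y > best[x]:
            best[x] = y
    hull: List[Point] = []
    for p in sorted(best.items()):
        # Drop the last knot while it lies on or below the chord to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def leader_backup(children: List[List[Point]]) -> List[Point]:
    """Assumed leader-node backup: envelope over all children's knots."""
    merged = [p for child in children for p in child]
    return upper_concave_envelope(merged)


def evaluate(frontier: List[Point], mu: float) -> float:
    """Best leader payoff when the follower receives mu (-inf if infeasible)."""
    if mu < frontier[0][0] or mu > frontier[-1][0]:
        return float("-inf")
    for (x0, y0), (x1, y1) in zip(frontier, frontier[1:]):
        if x0 <= mu <= x1:
            t = (mu - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return frontier[0][1]  # single-knot frontier with mu equal to its knot


# Two hypothetical child frontiers at a leader node.
child_a = [(0.0, 3.0), (2.0, 1.0)]
child_b = [(1.0, 0.0), (3.0, 2.0)]
parent = leader_backup([child_a, child_b])
```

With these made-up inputs, the parent frontier is the single segment from (0, 3) to (3, 2). Mixing between the two children lets the leader obtain payoffs that neither child offers on its own, which is why a single scalar value per state is not enough in this setting.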
Taxonomy
Topics: Game Theory and Applications · Sports Analytics and Performance · Reinforcement Learning in Robotics
Methods: Feedback Alignment
