Efficient Episodic Learning of Nonstationary and Unknown Zero-Sum Games Using Expert Game Ensembles
Yunian Pan, Quanyan Zhu

TL;DR
This paper introduces OFULinMat, an episodic learning algorithm for nonstationary zero-sum games that uses expert game ensembles to adaptively estimate game models and strategies, achieving sublinear regret.
Contribution
It presents a novel algorithm combining ensemble-based model estimation with strategy learning in nonstationary zero-sum games, with proven theoretical guarantees.
Findings
Achieves sublinear saddle-point regret in nonstationary settings
Demonstrates effectiveness through numerical simulations
Validates approach with a dynamic honeypot allocation case study
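The summary above does not give the paper's exact definition of saddle-point regret. As a rough illustration of the underlying idea, the sketch below computes the duality gap of a mixed-strategy pair in a known zero-sum matrix game, a standard measure of distance from a saddle point. The function name `duality_gap` and the convention that the row player maximizes are assumptions made for this example, not taken from the paper.

```python
def duality_gap(A, x, y):
    """Duality gap of mixed strategies (x, y) in the zero-sum matrix game A.

    Assumed convention: the row player chooses x to maximize x^T A y,
    the column player chooses y to minimize it. The gap is zero exactly
    when (x, y) is a saddle point.
    """
    m, n = len(A), len(A[0])
    # Expected payoff of each pure row action against the column mix y.
    row_payoffs = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    # Expected payoff of the row mix x against each pure column action.
    col_payoffs = [sum(x[i] * A[i][j] for i in range(m)) for j in range(n)]
    # Best-response gain of the row player plus that of the column player.
    return max(row_payoffs) - min(col_payoffs)


# Matching pennies: the uniform mix is the unique saddle point.
A = [[1.0, -1.0], [-1.0, 1.0]]
gap_uniform = duality_gap(A, [0.5, 0.5], [0.5, 0.5])
gap_pure = duality_gap(A, [1.0, 0.0], [1.0, 0.0])
```

Here `gap_uniform` is zero because neither player can gain by deviating, while the pure-strategy pair has a strictly positive gap. A regret notion built on quantities like this one would accumulate such gaps (or value differences) across rounds of play.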
Abstract
Game theory provides essential analysis in many applications of strategic interactions. However, the questions of how to construct a game model and how faithful that model is are seldom addressed. In this work, we consider learning in a class of repeated zero-sum games with an unknown, time-varying payoff matrix and noisy feedback, making use of an ensemble of benchmark game models. These models can be pre-trained and collected dynamically during sequential plays. They serve as prior side information and imperfectly underpin the unknown true game model. We propose OFULinMat, an episodic learning algorithm that integrates the adaptive estimation of game models and the learning of the strategies. The proposed algorithm is shown to achieve a sublinear bound on the saddle-point regret. We show that this algorithm is provably efficient through both theoretical analysis and numerical examples. We…
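To make the idea of estimating an unknown game from an ensemble of benchmark models concrete, here is a minimal sketch under strong simplifying assumptions that are not confirmed by the abstract: the true payoff matrix is an exact, fixed linear combination of the expert matrices, and each round reveals one noisy entry of it. The mixing weights are then recovered by ridge regression. The names `solve`, `estimate_weights`, `experts`, and `true_theta` are hypothetical, and the sketch omits the nonstationarity, confidence sets, and strategy learning that the paper's OFULinMat algorithm addresses.

```python
import random


def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def estimate_weights(experts, observations, lam=1.0):
    """Ridge estimate of mixing weights from (i, j, noisy_payoff) samples."""
    K = len(experts)
    # V = lam * I + sum of phi phi^T, u = sum of phi * payoff.
    V = [[lam if a == b else 0.0 for b in range(K)] for a in range(K)]
    u = [0.0] * K
    for i, j, payoff in observations:
        # Feature vector: the (i, j) entry of every expert matrix.
        phi = [E[i][j] for E in experts]
        for a in range(K):
            u[a] += phi[a] * payoff
            for b in range(K):
                V[a][b] += phi[a] * phi[b]
    return solve(V, u)


# Toy data: two hypothetical 2x2 expert games and a fixed true mixture.
random.seed(0)
experts = [[[1.0, -1.0], [-1.0, 1.0]], [[0.0, 2.0], [1.0, -1.0]]]
true_theta = [0.7, 0.3]
observations = []
for _ in range(2000):
    i, j = random.randrange(2), random.randrange(2)
    payoff = sum(t * E[i][j] for t, E in zip(true_theta, experts))
    observations.append((i, j, payoff + random.gauss(0.0, 0.1)))

theta_hat = estimate_weights(experts, observations)
```

With enough samples, `theta_hat` should land close to `true_theta`, and the estimated game is the correspondingly weighted sum of the expert matrices. A time-varying setting would need some mechanism for forgetting or restarting the estimate; that part is left out here.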
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Smart Grid Energy Management
