Policy Regret in Repeated Games

Raman Arora; Michael Dinitz; Teodor V. Marinov; Mehryar Mohri

arXiv:1811.04127·cs.LG·March 24, 2020·5 cites

Open Access

TL;DR

This paper explores the concept of policy regret in online learning, showing its differences from external regret, its implications in game theory, and introducing the new notion of policy equilibrium.

Contribution

The paper shows that policy regret and external regret can be incompatible in some online learning settings, shows that the two are not in conflict in the game-theoretic setting, and introduces policy equilibrium as a new notion of equilibrium.

Findings

01 Policy regret and external regret can be incompatible in certain online learning scenarios.

02 In game-theoretic settings, algorithms can ensure low regret for both measures simultaneously.

03 Policy equilibria encompass coarse correlated equilibria, with implications for learning dynamics.

Abstract

The notion of policy regret in online learning is a well-defined performance measure for the common scenario of adaptive adversaries, which more traditional quantities such as external regret do not take into account. We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other. We then focus on the game-theoretic setting where the adversary is a self-interested agent. In that setting, we show that external regret and policy regret are not in conflict and, in fact, that a wide class of algorithms can ensure a favorable regret with respect to both definitions, so long as the adversary is also using such an algorithm. We also show that the sequence of play of…
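To make the distinction between the two regret notions concrete, the following is a minimal sketch under stated assumptions, not the construction used in the paper. It assumes an adaptive adversary whose loss depends only on the learner's previous and current action (memory 1), and uses a hypothetical `toy_loss` invented purely for illustration. External regret compares against the best fixed action while keeping the realized history; policy regret compares against replaying the whole game with a fixed action.

```python
from typing import Callable, List, Sequence

# A memory-1 adaptive adversary: loss depends on (previous action, current action).
Loss = Callable[[int, int], int]


def toy_loss(prev: int, cur: int) -> int:
    """Hypothetical loss (an assumption, not from the paper).

    Having played 0 in the previous round costs 10 now;
    playing 1 in the current round costs 1 now.
    """
    return (10 if prev == 0 else 0) + (1 if cur == 1 else 0)


def total_loss(plays: Sequence[int], loss: Loss, initial: int = 0) -> int:
    """Cumulative loss of a sequence of plays, starting from `initial`."""
    total, prev = 0, initial
    for cur in plays:
        total += loss(prev, cur)
        prev = cur
    return total


def external_regret(plays: Sequence[int], actions: Sequence[int],
                    loss: Loss, initial: int = 0) -> int:
    """Compare against the best fixed action, keeping the realized history."""
    # Previous action actually seen by the adversary in each round.
    history: List[int] = ([initial] + list(plays))[:len(plays)]
    best = min(sum(loss(prev, a) for prev in history) for a in actions)
    return total_loss(plays, loss, initial) - best


def policy_regret(plays: Sequence[int], actions: Sequence[int],
                  loss: Loss, initial: int = 0) -> int:
    """Compare against replaying the whole game with one fixed action."""
    best = min(total_loss([a] * len(plays), loss, initial) for a in actions)
    return total_loss(plays, loss, initial) - best


if __name__ == "__main__":
    T, actions = 100, [0, 1]
    always_0, always_1 = [0] * T, [1] * T
    print(external_regret(always_0, actions, toy_loss))  # 0
    print(policy_regret(always_0, actions, toy_loss))    # 890
    print(external_regret(always_1, actions, toy_loss))  # 100
    print(policy_regret(always_1, actions, toy_loss))    # 0
```

In this toy instance, always playing 0 has zero external regret but policy regret growing linearly in the number of rounds, while always playing 1 has zero policy regret but linear external regret. This only illustrates the flavor of the incompatibility described in the abstract for two particular sequences; it does not reproduce the paper's result, which concerns any sequence of play.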

Taxonomy

Topics: Advanced Bandit Algorithms Research · Misinformation and Its Impacts · Game Theory and Applications