Toward a Classification of Finite Partial-Monitoring Games

András Antos, Gábor Bartók, Dávid Pál, Csaba Szepesvári

arXiv:1102.2041·cs.GT·October 13, 2011·2 cites

Abstract

Partial-monitoring games constitute a mathematical framework for sequential decision-making problems with imperfect feedback: the learner repeatedly chooses an action, the opponent responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his total cumulative loss. We make progress towards the classification of these games based on their minimax expected regret. Namely, we classify almost all games with two outcomes and a finite number of actions: we show that their minimax expected regret is either zero, $\widetilde{\Theta}(\sqrt{T})$, $\Theta(T^{2/3})$, or $\Theta(T)$, and we give a simple and efficiently computable classification of these four classes of games. Our hope is that the result can serve as a stepping stone toward classifying all finite…
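
The interaction protocol described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the matrices `LOSS` and `FEEDBACK`, the helper `play`, and the toy learner and opponent are not taken from the paper. Regret is computed against the best fixed action in hindsight, which is assumed here to be the intended notion.

```python
# Hypothetical game with 3 actions and 2 outcomes (not from the paper).
# LOSS[a][o] is the loss of action a under outcome o.
LOSS = [
    [0.0, 1.0],
    [1.0, 0.0],
    [0.5, 0.5],
]
# FEEDBACK[a][o] is the signal the learner sees; actions 0 and 1 reveal
# nothing about the outcome, action 2 reveals it.
FEEDBACK = [
    ["a", "a"],
    ["b", "b"],
    ["x", "y"],
]


def play(loss, feedback, learner, opponent, T):
    """Run T rounds and return (cumulative loss, regret).

    The learner only ever sees its own actions and the feedback signals,
    never the outcomes or the losses.
    """
    history = []    # (action, feedback signal) pairs visible to the learner
    outcomes = []   # hidden from the learner, used only to compute regret
    total_loss = 0.0
    for t in range(T):
        action = learner(history)
        outcome = opponent(t)
        total_loss += loss[action][outcome]
        history.append((action, feedback[action][outcome]))
        outcomes.append(outcome)
    # Regret relative to the best single action in hindsight.
    best_fixed = min(
        sum(loss[a][o] for o in outcomes) for a in range(len(loss))
    )
    return total_loss, total_loss - best_fixed


def always_first(history):
    # Toy learner: ignores feedback and always plays action 0.
    return 0


def always_second_outcome(t):
    # Toy opponent: always responds with outcome 1.
    return 1


result = play(LOSS, FEEDBACK, always_first, always_second_outcome, T=4)
```

In this toy run the learner never plays the informative action 2, so it cannot detect that action 1 would have been better; this is the kind of trade-off between low loss and informative feedback that the classification is concerned with.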

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics