Best-of-Both-Worlds Algorithms for Partial Monitoring

Taira Tsuchiya; Shinji Ito; Junya Honda

arXiv:2207.14550 · cs.LG · October 11, 2022

TL;DR

This paper introduces the first best-of-both-worlds algorithms for partial monitoring: their regret is favorably bounded in both the stochastic and the adversarial regime, with separate bounds for locally observable and globally observable games.

Contribution

It presents novel best-of-both-worlds algorithms for partial monitoring, with regret bounds for both stochastic and adversarial regimes based on game observability.

Findings

01

Regret bounds for non-degenerate locally observable games in stochastic and adversarial regimes.

02

Regret bounds for globally observable games in stochastic and adversarial regimes.

03

Algorithms are based on follow-the-regularized-leader with adaptive learning rates.
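As a rough illustration of the follow-the-regularized-leader (FTRL) idea named in the third finding, the sketch below runs FTRL with a negative Shannon entropy regularizer on a plain multi-armed bandit, which is only a special case of partial monitoring. It is not the paper's algorithm: the function names `ftrl_neg_entropy` and `run_ftrl_bandit`, the importance-weighted loss estimator, and the simple decreasing learning-rate schedule are all assumptions chosen for illustration, and the schedule stands in for the paper's adaptive rule, which is not reproduced here.

```python
import math
import random


def ftrl_neg_entropy(cum_loss_est, eta):
    """One FTRL step over the probability simplex.

    With the negative Shannon entropy regularizer scaled by 1/eta, the
    minimizer of <p, L> + (1/eta) * sum_i p_i log p_i has the closed
    form p_i proportional to exp(-eta * L_i).
    """
    shift = min(cum_loss_est)  # subtract the minimum for numerical stability
    weights = [math.exp(-eta * (loss - shift)) for loss in cum_loss_est]
    total = sum(weights)
    return [w / total for w in weights]


def run_ftrl_bandit(loss_fn, k, T, seed=0):
    """Run FTRL with bandit feedback for T rounds over k actions.

    loss_fn(t, a) returns the loss in [0, 1] of action a at round t.
    Returns the last sampling distribution and the total incurred loss.
    """
    rng = random.Random(seed)
    cum_loss_est = [0.0] * k
    total_loss = 0.0
    p = [1.0 / k] * k
    for t in range(1, T + 1):
        # Assumed schedule: decreasing in t, not the paper's adaptive rule.
        eta = math.sqrt(math.log(k) / (k * t))
        p = ftrl_neg_entropy(cum_loss_est, eta)
        a = rng.choices(range(k), weights=p)[0]
        loss = loss_fn(t, a)
        total_loss += loss
        # Importance-weighted estimate: unbiased for the full loss vector.
        cum_loss_est[a] += loss / p[a]
    return p, total_loss


if __name__ == "__main__":
    # Toy stochastic-style instance: action 0 is best with a fixed gap.
    fixed_losses = [0.1, 0.9, 0.9]
    p, total = run_ftrl_bandit(lambda t, a: fixed_losses[a], k=3, T=3000)
    print("final distribution:", [round(x, 3) for x in p])
    print("total loss:", round(total, 1))
```

In a best-of-both-worlds design, the learning rate would instead be tuned from observed quantities so that the same procedure attains logarithmic regret on stochastic instances while keeping the worst-case adversarial guarantee; the fixed schedule above only gives the adversarial-style behavior.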

Abstract

This study considers the partial monitoring problem with $k$-actions and $d$-outcomes and provides the first best-of-both-worlds algorithms, whose regrets are favorably bounded both in the stochastic and adversarial regimes. In particular, we show that for non-degenerate locally observable games, the regret is $O(m^{2} k^{4} \log(T) \log(k_{\Pi} T) / \Delta_{\min})$ in the stochastic regime and $O(m k^{2/3} \sqrt{T \log(T) \log k_{\Pi}})$ in the adversarial regime, where $T$ is the number of rounds, $m$ is the maximum number of distinct observations per action, $\Delta_{\min}$ is the minimum suboptimality gap, and $k_{\Pi}$ is the number of Pareto optimal actions. Moreover, we show that for globally observable games, the regret is $O(c_{G}^{2} \log(T) \log(k_{\Pi} T) / \Delta_{\min}^{2})$ in the stochastic regime and $O((c_{G}^{2} \log(T) \log(k_{\Pi} T))^{1/3} T^{2/3})$ in…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics