An Adaptive Algorithm for Finite Stochastic Partial Monitoring

Gabor Bartok (University of Alberta); Navid Zolghadr (University of Alberta); Csaba Szepesvari (University of Alberta)

arXiv:1206.6487·cs.LG·July 3, 2012·ICML·23 cites

TL;DR

This paper introduces an adaptive anytime algorithm for finite stochastic partial monitoring that achieves the minimax regret, up to logarithmic factors, on both easy and hard problems, and that enjoys O(√T) regret in Dynamic Pricing under some additional assumptions.

Contribution

The paper proposes an adaptive anytime algorithm whose regret is near-optimal on every instance of finite stochastic partial monitoring. When the opponent's strategy lies in an "easy region" of the strategy space, the regret grows as if the problem were easy.

Findings

1. Achieves the minimax regret, within logarithmic factors, for both easy and hard problems.

2. Attains logarithmic individual regret on easy problems.

3. Achieves O(√T) regret in Dynamic Pricing under some reasonable additional assumptions.

Abstract

We present a new anytime algorithm that achieves near-optimal regret for any instance of finite stochastic partial monitoring. In particular, the new algorithm achieves the minimax regret, within logarithmic factors, for both "easy" and "hard" problems. For easy problems, it additionally achieves logarithmic individual regret. Most importantly, the algorithm is adaptive in the sense that if the opponent strategy is in an "easy region" of the strategy space then the regret grows as if the problem was easy. As an implication, we show that under some reasonable additional assumptions, the algorithm enjoys an O(\sqrt{T}) regret in Dynamic Pricing, proven to be hard by Bartok et al. (2011).
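
For orientation, here is a minimal sketch of the objects the abstract refers to: a finite partial-monitoring game given by a loss matrix and a feedback matrix, and the expected regret of a sequence of actions against a stochastic opponent. The dynamic-pricing structure used here (the learner posts a price, the customer has a hidden maximum price, and the learner observes only "sale" or "no sale") is an assumption based on the usual description of that game. The function names, the integer price grid, and the `no_sale_loss` constant are hypothetical. This is not the paper's algorithm; it only makes the regret quantity concrete.

```python
import numpy as np

def dynamic_pricing_game(n_prices, no_sale_loss):
    """Build illustrative loss and feedback matrices.

    Rows index the learner's posted price, columns the customer's
    hidden maximum price (both on the grid 0, 1, ..., n_prices - 1).
    """
    prices = np.arange(n_prices)
    i, j = np.meshgrid(prices, prices, indexing="ij")
    sale = i <= j  # a sale happens when the posted price is affordable
    # On a sale the loss is the money left on the table (j - i);
    # otherwise the learner pays an assumed constant no-sale loss.
    loss = np.where(sale, j - i, no_sale_loss).astype(float)
    # The learner never sees the loss, only whether a sale occurred.
    feedback = sale.astype(int)  # 1 = sale, 0 = no sale
    return loss, feedback

def expected_regret(loss, p, actions):
    """Expected regret of `actions` when outcomes are drawn i.i.d. from p."""
    exp_loss = loss @ p  # expected loss of each action under p
    return float(np.sum(exp_loss[actions] - exp_loss.min()))

loss, feedback = dynamic_pricing_game(n_prices=3, no_sale_loss=1.0)
p = np.array([0.0, 0.0, 1.0])  # customer always accepts the top price
print(expected_regret(loss, p, actions=[0, 1, 2]))
```

Different posted prices can yield the same feedback symbol while incurring different losses, which is why the learner cannot estimate its regret directly from what it observes.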

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Gaussian Processes and Bayesian Inference