Second Order Regret Bounds Against Generalized Expert Sequences under Partial Bandit Feedback
Kaan Gokcesu, Hakan Gokcesu

TL;DR
This paper introduces a minimax optimal online algorithm for expert advice with partial bandit feedback, achieving second order regret bounds against a general class of expert selection sequences, even when the losses are revealed adversarially.
Contribution
It presents a novel universal prediction algorithm that handles general partial monitoring and adversarial loss revelation, with second order regret guarantees.
Findings
The algorithm achieves second order regret bounds in terms of the sum of squared losses.
Its normalized regret is invariant under arbitrary affine transforms of the loss sequence.
The algorithm is fully online and requires no prior information about the loss sequence.
Abstract
We study the problem of expert advice under the partial bandit feedback setting and create a sequential minimax optimal algorithm. Our algorithm works with a more general partial monitoring setting, where, in contrast to the classical bandit feedback, the losses can be revealed in an adversarial manner. Our algorithm adopts a universal prediction perspective, whose performance is analyzed with regret against a general expert selection sequence. The regret we study is against a general competition class that covers many settings (such as the switching or contextual experts settings) and the expert selection sequences in the competition class are determined by the application at hand. Our regret bounds are second order bounds in terms of the sum of squared losses and the normalized regret of our algorithm is invariant under arbitrary affine transforms of the loss sequence. Our algorithm is…
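To make the regret notion in the abstract concrete, here is a minimal sketch that measures the regret of a sequence of plays against a switching expert selection sequence and checks how it behaves under an affine transform of the losses. It is not the paper's algorithm. The helper names (`regret`, `loss_range`), the random data, the uniformly random player, and the normalization by the loss range are all assumptions chosen for illustration, since this excerpt does not specify the paper's normalization. The sketch also uses a positive scale factor only.

```python
import random


def regret(losses, plays, comparator):
    """Cumulative loss of `plays` minus cumulative loss of `comparator`.

    losses[t][m] is the loss of expert m at round t; `plays` and
    `comparator` give one expert index per round.
    """
    return sum(row[p] - row[c] for row, p, c in zip(losses, plays, comparator))


def loss_range(losses):
    """Spread (max minus min) of all losses; an assumed normalizer."""
    flat = [x for row in losses for x in row]
    return max(flat) - min(flat)


random.seed(0)
T, M = 200, 4  # rounds and number of experts (illustrative values)
losses = [[random.uniform(-1.0, 3.0) for _ in range(M)] for _ in range(T)]

# A placeholder player that picks experts uniformly at random.
plays = [random.randrange(M) for _ in range(T)]

# A switching comparator: expert 0 in the first half, expert 2 afterwards.
comparator = [0 if t < T // 2 else 2 for t in range(T)]

# Affine transform of every loss: l' = a * l + b with a > 0.
a, b = 5.0, -7.0
transformed = [[a * x + b for x in row] for row in losses]

r = regret(losses, plays, comparator)
r_t = regret(transformed, plays, comparator)

# The shift b cancels in the difference, so regret only scales by a.
normalized = r / loss_range(losses)
normalized_t = r_t / loss_range(transformed)
```

The shift cancels because regret is a difference of losses at the same round, and the scale cancels once regret is divided by a quantity that scales the same way, so `normalized` and `normalized_t` agree up to floating point error.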
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Distributed Sensor Networks and Detection Algorithms
