Bayesian Algorithms for Adversarial Online Learning: from Finite to Infinite Action Spaces
Alexander Terenin, Jeffrey Negrea

TL;DR
This paper introduces a form of Thompson sampling for adversarial online learning under full feedback, extends it to infinite action spaces, and proves a regret bound on the unit cube using a Gaussian process prior.
Contribution
The paper develops a Thompson sampling framework whose prior is placed over the adversary's future actions rather than over the experts, and obtains regret bounds in infinite action spaces using Gaussian process priors.
Findings
Recovers optimal rates in finite expert settings.
Provides regret bounds for continuous action spaces with Gaussian process priors.
Extends Bayesian online learning to uncountably infinite action spaces.
Abstract
We develop a form of Thompson sampling for online learning under full feedback (also known as prediction with expert advice) where the learner's prior is defined over the space of an adversary's future actions, rather than the space of experts. We show regret decomposes into regret the learner expected a priori, plus a prior-robustness-type term we call excess regret. In the classical finite-expert setting, this recovers optimal rates. As an initial step towards practical online learning in settings with a potentially uncountably infinite number of experts, we show that Thompson sampling over the $d$-dimensional unit cube, using a certain Gaussian process prior widely used in the Bayesian optimization literature, has a $\mathcal{O}\big(\beta\sqrt{Td\log(1+\sqrt{d}\,\lambda/\beta)}\big)$ rate against a $\beta$-bounded $\lambda$-Lipschitz adversary.
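The finite-expert case described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' construction: the prior treats the adversary's per-round reward vectors as independent standard Gaussians, the feedback is framed as rewards rather than losses, and the names `thompson_sampling_experts` and `prior_scale` are hypothetical. Under this independence assumption, observing past rounds does not change the posterior over future rounds, so sampling the adversary's remaining actions reduces to adding one Gaussian perturbation whose variance equals the number of remaining rounds.

```python
import numpy as np


def thompson_sampling_experts(rewards, prior_scale=1.0, seed=0):
    """Thompson sampling with a prior over the adversary's future rewards.

    rewards: array of shape (T, n); row t is the reward vector the adversary
        reveals after the learner commits to an expert in round t.
    prior_scale: standard deviation of the assumed i.i.d. Gaussian prior on
        each reward entry (a hypothetical parameter for this sketch).
    """
    rng = np.random.default_rng(seed)
    T, n = rewards.shape
    cumulative = np.zeros(n)          # observed rewards per expert so far
    choices = np.empty(T, dtype=int)

    for t in range(T):
        remaining = T - t             # rounds t, ..., T-1 are still unseen
        # The sum of `remaining` independent N(0, prior_scale^2) draws is
        # N(0, remaining * prior_scale^2), so one draw per expert suffices.
        future = prior_scale * np.sqrt(remaining) * rng.standard_normal(n)
        # Play the expert that is best in hindsight on observed + sampled data.
        choices[t] = int(np.argmax(cumulative + future))
        cumulative += rewards[t]      # full feedback: every expert's reward

    learner_total = rewards[np.arange(T), choices].sum()
    regret = cumulative.max() - learner_total
    return choices, regret
```

A call such as `thompson_sampling_experts(np.random.default_rng(1).uniform(-1, 1, size=(200, 5)))` returns the sequence of chosen experts and the realized regret against the best fixed expert. Because the learner may switch experts between rounds, the realized regret on a particular sequence can be negative.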
Taxonomy
Topics: Machine Learning and Algorithms · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
Methods: Gaussian Process
