Adversarial Bandit Optimization with Globally Bounded Perturbations to Linear Losses
Zhuoyu Cheng, Kohei Hatano, Eiji Takimoto

TL;DR
This paper introduces an adversarial bandit framework in which linear losses are subject to perturbations bounded by a global budget. It provides regret guarantees and, as a special case, improves a high-probability regret bound for classical bandit linear optimization.
Contribution
The paper models perturbations whose cumulative magnitude, measured relative to the linear losses, is constrained by a global budget. Under this model it derives expected and high-probability regret upper bounds, together with a lower bound on the expected regret.
Findings
Established expected and high-probability regret guarantees.
Recovered an improved high-probability regret bound for classical bandit linear optimization, the special case without perturbations.
Proved a lower bound on expected regret.
Abstract
We study a class of adversarial bandit optimization problems in which the loss functions may be non-convex and non-smooth. In each round, the learner observes a loss that consists of an underlying linear component together with an additional perturbation applied after the learner selects an action. The perturbations are measured relative to the linear losses and are constrained by a global budget that bounds their cumulative magnitude over time. Under this model, we establish both expected and high-probability regret guarantees. As a special case of our analysis, we recover an improved high-probability regret bound for classical bandit linear optimization, which corresponds to the setting without perturbations. We further complement our upper bounds by proving a lower bound on the expected regret.
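The interaction protocol described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm or its exact model: the function name `simulate`, the finite action set, the random perturbation rule, the uniformly random placeholder learner, and the choice to measure regret against the linear component alone are all hypothetical choices made for illustration.

```python
import random


def simulate(T=200, d=3, budget=5.0, seed=0):
    """Toy bandit loop: linear loss plus a budget-limited perturbation.

    Everything here is an illustrative assumption; the abstract does not
    specify the action set, the perturbation rule, or the learner.
    """
    rng = random.Random(seed)

    # Hypothetical action set: signed standard basis vectors in R^d.
    actions = []
    for i in range(d):
        for sign in (1.0, -1.0):
            a = [0.0] * d
            a[i] = sign
            actions.append(tuple(a))

    remaining = budget              # perturbation budget still available
    spent = 0.0                     # total perturbation magnitude used
    learner_loss = 0.0              # cumulative observed (perturbed) loss
    linear_cum = [0.0] * len(actions)  # cumulative linear loss per action

    for _ in range(T):
        # The adversary fixes a linear loss vector for this round.
        loss_vec = [rng.uniform(-1.0, 1.0) for _ in range(d)]

        # Placeholder learner: picks an action uniformly at random.
        x = actions[rng.randrange(len(actions))]
        linear = sum(l * xi for l, xi in zip(loss_vec, x))

        # The perturbation is applied after the action is chosen, and its
        # magnitude is capped by what is left of the global budget.
        magnitude = min(remaining, abs(rng.gauss(0.0, 0.1)))
        remaining -= magnitude
        spent += magnitude
        observed = linear + rng.choice((-1.0, 1.0)) * magnitude

        # Bandit feedback: only the scalar `observed` would reach the learner.
        learner_loss += observed

        # Bookkeeping for the comparator (not visible to the learner).
        for j, a in enumerate(actions):
            linear_cum[j] += sum(l * ai for l, ai in zip(loss_vec, a))

    # Assumption: regret is taken against the best fixed action under the
    # linear component only.
    regret = learner_loss - min(linear_cum)
    return regret, spent
```

Setting `budget=0.0` removes the perturbation entirely, which corresponds to the classical bandit linear optimization setting mentioned in the abstract. For example, `regret, spent = simulate(budget=0.0)` returns `spent == 0.0`.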
