Adversarial bandit optimization for approximately linear functions
Zhuoyu Cheng, Kohei Hatano, Eiji Takimoto

TL;DR
This paper studies adversarial bandit optimization for approximately linear loss functions (a linear function plus a small, arbitrary perturbation). It gives expected and high-probability regret upper bounds, together with a lower bound on the expected regret, for this nonconvex, non-smooth setting.
Contribution
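The paper derives expected and high-probability regret bounds for bandit problems whose losses are linear up to a small perturbation. As the special case with no perturbation, it obtains an improved high-probability regret bound for bandit linear optimization. It also establishes a lower bound on the expected regret.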
Findings
Expected and high-probability regret bounds derived.
Improved high-probability regret bound for bandit linear optimization.
Lower bound on expected regret established.
Abstract
We consider a bandit optimization problem for nonconvex and non-smooth functions, where in each trial the loss function is the sum of a linear function and a small but arbitrary perturbation chosen after observing the player's choice. We give both expected and high-probability regret bounds for the problem. Our result also implies an improved high-probability regret bound for bandit linear optimization, a special case with no perturbation. We also give a lower bound on the expected regret.
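To make the interaction protocol in the abstract concrete, here is a minimal simulation sketch. It is not the paper's algorithm: the player simply samples uniformly from the unit ball as a placeholder strategy. The decision set (Euclidean unit ball), the perturbation bound `eps`, and all function names are assumptions introduced for illustration. The sketch only shows that the player sees a single scalar loss per round, and that a perturbation bounded by `eps` shifts the cumulative loss by at most `eps * T` relative to the linear part.

```python
import math
import random


def sample_unit_ball(d, rng):
    """Draw a point uniformly from the d-dimensional Euclidean unit ball."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    radius = rng.random() ** (1.0 / d)
    return [radius * v / norm for v in g]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def simulate(T=200, d=3, eps=0.05, seed=0):
    """Run T rounds of the bandit protocol with a placeholder player.

    Assumed setup (hypothetical, for illustration only):
      - the player picks x_t in the unit ball,
      - the loss is <theta_t, x_t> + sigma_t with |sigma_t| <= eps,
      - sigma_t is chosen after x_t is revealed,
      - the player observes only the scalar loss.
    """
    rng = random.Random(seed)
    theta_sum = [0.0] * d
    total_loss = 0.0   # cumulative loss including the perturbation
    linear_loss = 0.0  # cumulative loss of the linear part only
    for _ in range(T):
        x = sample_unit_ball(d, rng)      # placeholder strategy, not the paper's
        theta = sample_unit_ball(d, rng)  # linear part of this round's loss
        sigma = eps                       # perturbation set after seeing x
        observed = dot(theta, x) + sigma  # the only feedback the player gets
        total_loss += observed
        linear_loss += dot(theta, x)
        theta_sum = [a + b for a, b in zip(theta_sum, theta)]
    # Best fixed point in the unit ball for the linear part is
    # -theta_sum / ||theta_sum||, with cumulative loss -||theta_sum||.
    comparator_norm = math.sqrt(sum(v * v for v in theta_sum))
    return {
        "total_loss": total_loss,
        "linear_loss": linear_loss,
        "comparator_norm": comparator_norm,
        "linear_regret": linear_loss + comparator_norm,
    }


if __name__ == "__main__":
    print(simulate())
```

Under the assumption that the perturbation is bounded by `eps` in absolute value at every point, the regret measured on the full loss differs from `linear_regret` by at most `2 * eps * T`, since both the player's cumulative loss and the comparator's cumulative loss shift by at most `eps * T`.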
Taxonomy
Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Stochastic Gradient Optimization Techniques
