Adversarial bandit optimization for approximately linear functions

Zhuoyu Cheng; Kohei Hatano; Eiji Takimoto

arXiv:2505.20734 · cs.LG · January 7, 2026

TL;DR

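This paper studies adversarial bandit optimization for approximately linear functions, where each loss is a linear function plus a small but arbitrary perturbation chosen after the player's choice. It gives expected and high-probability regret bounds for this nonconvex, non-smooth setting, together with a lower bound on the expected regret.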

Contribution

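The paper derives expected and high-probability regret bounds for bandit optimization with perturbed linear losses, obtains an improved high-probability regret bound for bandit linear optimization as the special case with no perturbation, and establishes a lower bound on the expected regret.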

Findings

01 Expected and high-probability regret bounds derived.

02 Improved high-probability regret bound for bandit linear optimization.

03 Lower bound on expected regret established.

Abstract

We consider a bandit optimization problem for nonconvex and non-smooth functions, where in each trial the loss function is the sum of a linear function and a small but arbitrary perturbation chosen after observing the player's choice. We give both expected and high-probability regret bounds for the problem. Our result also implies an improved high-probability regret bound for bandit linear optimization, a special case with no perturbation. We also give a lower bound on the expected regret.
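The interaction described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the player below picks actions uniformly at random as a placeholder, the perturbation is drawn at random instead of adversarially, and the bound `epsilon`, the finite action set, and all function names are assumptions introduced for illustration (the abstract only says the perturbation is "small").

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def simulate(actions, loss_vectors, epsilon, seed=0):
    """Play one pass of the bandit protocol with a placeholder player.

    Assumption: each perturbation has absolute value at most `epsilon`.
    """
    rng = random.Random(seed)
    observed_total = 0.0  # what the player actually pays
    linear_total = 0.0    # the linear part of what the player pays
    for loss_vector in loss_vectors:
        # Placeholder player: uniform random choice (hypothetical, not the paper's method).
        x = rng.choice(actions)
        linear_loss = dot(loss_vector, x)
        # The perturbation is chosen after x is fixed; here it is random for illustration.
        perturbation = rng.uniform(-epsilon, epsilon)
        # Bandit feedback: only this single scalar is revealed to the player.
        observed_total += linear_loss + perturbation
        linear_total += linear_loss
    # Best fixed action in hindsight, measured on the linear parts only.
    best_fixed = min(sum(dot(l, a) for l in loss_vectors) for a in actions)
    return observed_total, linear_total, best_fixed


actions = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
loss_vectors = [(0.5, -0.2), (-0.3, 0.4), (0.1, 0.1)] * 10
observed, linear, best = simulate(actions, loss_vectors, epsilon=0.05)
print("regret against the best fixed action (linear part):", observed - best)
```

Under the assumed bound, the observed cumulative loss differs from its linear part by at most `epsilon` times the number of trials, which is why the no-perturbation case reduces to bandit linear optimization.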

Taxonomy

Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Stochastic Gradient Optimization Techniques