Non-Stationary Bandit Convex Optimization: An Optimal Algorithm with Two-Point Feedback

Chang He; Bo Jiang; Shuzhong Zhang

arXiv:2508.04654·math.OC·September 9, 2025

TL;DR

This paper introduces an algorithm for non-stationary bandit convex optimization with two-point feedback whose dynamic regret is optimal in Euclidean space and near-optimal on non-Euclidean domains such as the simplex and the cross-polytope.

Contribution

It extends bandit mirror descent to non-stationary environments with two-point feedback, providing nearly optimal regret bounds in various geometric settings.

Findings

01

Achieves optimal regret bounds in Euclidean space matching previous lower bounds.

02

Extends to non-Euclidean settings like the simplex and cross-polytope with near-optimal bounds.

03

Improves upon previous work (Zhao et al., 2021) in Euclidean space by the dimension-dependent factor stated in the abstract.

Abstract

This paper studies bandit convex optimization in non-stationary environments with two-point feedback, using dynamic regret as the performance measure. We propose an algorithm based on bandit mirror descent that extends naturally to non-Euclidean settings. Let $T$ be the total number of iterations and $P_{T,p}$ the path variation with respect to the $\ell_p$-norm. In Euclidean space, our algorithm matches the optimal regret bound $O(\sqrt{dT(1 + P_{T,2})})$, improving upon Zhao et al. (2021) by a factor of $O(\sqrt{d})$. Beyond Euclidean settings, our algorithm achieves an upper bound of $O(\sqrt{d \log(d)\, T \log(T)\, (1 + P_{T,1})})$ on the simplex, which is nearly optimal up to log factors. For the cross-polytope, the bound reduces to $O(\sqrt{d \log(d)\, T\, (1 + P_{T,p})})$ for some $p = 1 + 1/\log(d)$.
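To make the setting concrete, here is a minimal Python sketch of the Euclidean special case only. It is not the paper's algorithm. It pairs the standard two-point gradient estimator with projected gradient descent, which is mirror descent with the squared Euclidean norm as regularizer. It also tracks dynamic regret against a moving comparator and that comparator's path variation, measured in the $\ell_2$-norm. The drifting quadratic losses, the unit-ball domain, the step size `eta`, the smoothing radius `delta`, and every function name are assumptions made for illustration.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a direction uniformly from the Euclidean unit sphere in R^d."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def two_point_gradient(f, x, delta, rng):
    """Standard two-point estimator: (d / (2*delta)) * (f(x+delta*u) - f(x-delta*u)) * u."""
    d = len(x)
    u = sample_unit_sphere(d, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u], x_plus, x_minus


def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius centred at 0."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [radius * v / norm for v in x]


def make_loss(center):
    """Hypothetical round-t loss: squared distance to a drifting center."""
    return lambda z: sum((zi - ci) ** 2 for zi, ci in zip(z, center))


def run_sketch(T=2000, d=5, eta=0.05, delta=1e-3, radius=1.0, seed=0):
    """Return (dynamic regret, l2 path variation) on an assumed drifting problem."""
    rng = random.Random(seed)
    x = [0.0] * d
    regret, path, prev = 0.0, 0.0, None
    for t in range(T):
        # Assumed non-stationarity: the minimizer u_t moves slowly around a circle.
        angle = 2.0 * math.pi * t / T
        u_t = [0.5 * math.cos(angle), 0.5 * math.sin(angle)] + [0.0] * (d - 2)
        f = make_loss(u_t)

        grad, x_plus, x_minus = two_point_gradient(f, x, delta, rng)

        # Loss suffered is the average over the two queried points,
        # compared against the comparator u_t for this round.
        regret += 0.5 * (f(x_plus) + f(x_minus)) - f(u_t)
        if prev is not None:
            path += math.sqrt(sum((a - b) ** 2 for a, b in zip(u_t, prev)))
        prev = u_t

        # Gradient step, then projection onto a shrunk ball so that both
        # query points x +/- delta*u of the next round stay inside the domain.
        x = project_ball([xi - eta * gi for xi, gi in zip(x, grad)], radius - delta)
    return regret, path


if __name__ == "__main__":
    regret, path = run_sketch()
    print(f"dynamic regret: {regret:.3f}, path variation P_T: {path:.3f}")
```

Replacing the Euclidean projection step with a mirror step under a different regularizer is what would adapt such a sketch to domains like the simplex. The abstract does not specify that construction, so it is left out here.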
