Non-stationary Bandit Convex Optimization: A Comprehensive Study

Xiaoqi Liu; Dorian Baudry; Julian Zimmert; Patrick Rebeschini; Arya Akhavan

arXiv:2506.02980·stat.ML·December 2, 2025

TL;DR

This paper studies non-stationary bandit convex optimization, proposing algorithms that adapt to environment changes and achieve minimax-optimal regret bounds under various measures of non-stationarity.

Contribution

The paper introduces TEWA-SE, a polynomial-time algorithm for strongly convex losses, and cExO, an algorithm for general convex losses that does not run in polynomial time. Both achieve minimax-optimal regret bounds in non-stationary settings.

Findings

01. TEWA-SE is minimax-optimal for strongly convex losses when the non-stationarity measures $S$ and $Δ$ are known.

02. cExO achieves minimax-optimal regret for general convex losses and improves bounds that depend on the path-length $P$.

03. The algorithms adapt to unknown non-stationarity measures through the Bandit-over-Bandit framework.

Abstract

Bandit Convex Optimization is a fundamental class of sequential decision-making problems, where the learner selects actions from a continuous domain and observes a loss (but not its gradient) at only one point per round. We study this problem in non-stationary environments, and aim to minimize the regret under three standard measures of non-stationarity: the number of switches $S$ in the comparator sequence, the total variation $Δ$ of the loss functions, and the path-length $P$ of the comparator sequence. We propose a polynomial-time algorithm, Tilted Exponentially Weighted Average with Sleeping Experts (TEWA-SE), which adapts the sleeping experts framework from online convex optimization to the bandit setting. For strongly convex losses, we prove that TEWA-SE is minimax-optimal with respect to known $S$ and $Δ$ by establishing matching upper and lower bounds. By equipping…
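To make the three non-stationarity measures concrete, the following is a minimal sketch on a toy one-dimensional problem. It is not TEWA-SE or cExO. The quadratic losses, the domain $[-1, 1]$, the grid approximation of the maximum, and all function names are assumptions introduced for illustration, and the definitions follow common conventions in the non-stationary online learning literature, which may differ from the paper's by constants (for example, some works count $S$ as one plus the number of changes).

```python
from typing import Callable, List, Sequence

Loss = Callable[[float], float]


def make_losses(centers: Sequence[float]) -> List[Loss]:
    """Hypothetical toy losses f_t(x) = (x - c_t)^2 on the domain [-1, 1]."""
    return [lambda x, c=c: (x - c) ** 2 for c in centers]


def num_switches(comparators: Sequence[float]) -> int:
    """Number of rounds t >= 2 at which the comparator u_t differs from u_{t-1}."""
    return sum(1 for prev, cur in zip(comparators, comparators[1:]) if cur != prev)


def path_length(comparators: Sequence[float]) -> float:
    """Sum over t >= 2 of |u_t - u_{t-1}|."""
    return sum(abs(cur - prev) for prev, cur in zip(comparators, comparators[1:]))


def total_variation(losses: Sequence[Loss], grid: Sequence[float]) -> float:
    """Sum over t >= 2 of max_x |f_t(x) - f_{t-1}(x)|, with the max taken over a grid."""
    return sum(
        max(abs(cur(x) - prev(x)) for x in grid)
        for prev, cur in zip(losses, losses[1:])
    )


def dynamic_regret(
    losses: Sequence[Loss], actions: Sequence[float], comparators: Sequence[float]
) -> float:
    """Sum over t of f_t(x_t) - f_t(u_t).

    In the bandit setting the learner would only ever see the single value
    f_t(x_t) each round; the full sum is computed here for analysis only.
    """
    return sum(f(x) - f(u) for f, x, u in zip(losses, actions, comparators))


# Toy environment: the minimizer jumps twice over five rounds.
centers = [0.5, 0.5, -0.5, -0.5, 0.5]
losses = make_losses(centers)
grid = [-1 + 2 * i / 200 for i in range(201)]  # includes both endpoints of [-1, 1]
naive_actions = [0.0] * len(centers)  # a learner that never adapts

print("S (changes):", num_switches(centers))
print("P:", path_length(centers))
print("Delta (grid approx.):", total_variation(losses, grid))
print("dynamic regret of the naive learner:", dynamic_regret(losses, naive_actions, centers))
```

Here the comparator sequence is taken to be the per-round minimizers $c_t$, so $S$ and $P$ describe how the target moves, while $Δ$ describes how much the loss functions themselves change between rounds.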

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management