Non-stationary Bandit Convex Optimization: A Comprehensive Study
Xiaoqi Liu, Dorian Baudry, Julian Zimmert, Patrick Rebeschini, Arya Akhavan

TL;DR
This paper studies non-stationary bandit convex optimization, proposing algorithms that adapt to environment changes and achieve minimax-optimal regret bounds under various measures of non-stationarity.
Contribution
It introduces two algorithms: TEWA-SE, which runs in polynomial time and targets strongly convex losses, and cExO, which does not run in polynomial time and targets general convex losses. Both achieve optimal regret bounds in non-stationary settings.
Findings
TEWA-SE is minimax-optimal for strongly convex losses with known non-stationarity measures.
cExO achieves minimax-optimal regret for general convex losses and improves the regret bounds stated in terms of the comparator path-length.
Algorithms adapt to unknown non-stationarity measures using the Bandit-over-Bandit framework.
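The Bandit-over-Bandit idea mentioned above can be illustrated with a minimal sketch. It assumes the usual formulation of the framework: the horizon is split into epochs, an EXP3-style meta-learner picks one candidate tuning parameter per epoch, and the base algorithm is restarted with that parameter. The names run_epoch, candidates, epoch_len and loss_bound are hypothetical, the base algorithm is a toy stand-in rather than TEWA-SE, and the learning rate is one common textbook choice, not the paper's.

```python
import math
import random


def bandit_over_bandit(run_epoch, candidates, horizon, epoch_len, loss_bound, seed=0):
    """EXP3 meta-learner that chooses one tuning parameter per epoch.

    run_epoch(param, start, length) is a hypothetical callback: it restarts
    the base algorithm with `param`, plays `length` rounds beginning at round
    `start`, and returns the cumulative loss suffered in those rounds.
    """
    rng = random.Random(seed)
    k = len(candidates)
    n_epochs = math.ceil(horizon / epoch_len)
    eta = math.sqrt(math.log(k) / (k * n_epochs))  # a common EXP3 step size
    log_w = [0.0] * k  # log-weights, kept in log space for stability
    history = []
    for e in range(n_epochs):
        start = e * epoch_len
        length = min(epoch_len, horizon - start)  # last epoch may be shorter
        # Turn log-weights into sampling probabilities.
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        total = sum(w)
        probs = [v / total for v in w]
        arm = rng.choices(range(k), weights=probs)[0]
        epoch_loss = run_epoch(candidates[arm], start, length)
        # Rescale the epoch loss to [0, 1], then apply the importance-weighted
        # EXP3 update to the chosen arm only.
        scaled = min(max(epoch_loss / (loss_bound * length), 0.0), 1.0)
        log_w[arm] -= eta * scaled / probs[arm]
        history.append((candidates[arm], length, epoch_loss))
    return history


# Toy base algorithm (assumption): per-round loss is |param - 0.25|.
def toy_run_epoch(param, start, length):
    return length * abs(param - 0.25)


candidates = [0.0, 0.25, 0.5, 1.0]
history = bandit_over_bandit(toy_run_epoch, candidates, horizon=1000,
                             epoch_len=64, loss_bound=1.0)
```

Each entry of history records the parameter played in an epoch, the epoch length, and the loss it incurred. The meta-learner only sees the loss of the parameter it actually chose, which is why the update is importance-weighted.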
Abstract
Bandit Convex Optimization is a fundamental class of sequential decision-making problems, where the learner selects actions from a continuous domain and observes a loss (but not its gradient) at only one point per round. We study this problem in non-stationary environments, and aim to minimize the regret under three standard measures of non-stationarity: the number of switches in the comparator sequence, the total variation of the loss functions, and the path-length of the comparator sequence. We propose a polynomial-time algorithm, Tilted Exponentially Weighted Average with Sleeping Experts (TEWA-SE), which adapts the sleeping experts framework from online convex optimization to the bandit setting. For strongly convex losses, we prove that TEWA-SE is minimax-optimal with respect to the known number of switches and total variation by establishing matching upper and lower bounds. By equipping…
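The three non-stationarity measures named in the abstract can be made concrete with a small sketch. It assumes the common definitions: switches count the rounds where the comparator changes (some papers add 1 to this count), path-length sums the Euclidean distances between consecutive comparators, and total variation sums the largest pointwise change between consecutive losses. Taking that maximum over a finite grid instead of the whole domain is an approximation made here for illustration, and all function names are hypothetical.

```python
import math


def num_switches(comparators):
    """Number of rounds t where comparator u_t differs from u_{t-1}."""
    return sum(1 for prev, cur in zip(comparators, comparators[1:]) if prev != cur)


def path_length(comparators):
    """Sum of Euclidean distances between consecutive comparators."""
    return sum(math.dist(prev, cur) for prev, cur in zip(comparators, comparators[1:]))


def total_variation(losses, grid):
    """Sum over rounds of max |f_t(x) - f_{t-1}(x)|, with x restricted to `grid`."""
    return sum(
        max(abs(cur(x) - prev(x)) for x in grid)
        for prev, cur in zip(losses, losses[1:])
    )


def quadratic_loss(u):
    """Toy strongly convex loss f(x) = ||x - u||^2 with minimizer u."""
    return lambda x: sum((xi - ui) ** 2 for xi, ui in zip(x, u))


# Toy 1-D example: the minimizer moves twice over five rounds.
us = [(0.0,), (0.0,), (1.0,), (1.0,), (0.5,)]
fs = [quadratic_loss(u) for u in us]
grid = [(0.0,), (0.5,), (1.0,)]

print(num_switches(us))           # 2
print(path_length(us))            # 1.5
print(total_variation(fs, grid))  # 1.75
```

In this toy sequence the comparator changes twice and travels a total distance of 1.5, while the losses change by at most 1.0 and then 0.75 on the grid, so the three measures quantify the same drift in different units.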
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management
