On Adaptivity in Non-stationary Stochastic Optimization With Bandit Feedback
Yining Wang

TL;DR
This paper introduces a bandit optimization algorithm that achieves optimal dynamic regret in non-stationary environments without prior knowledge of the function change budget.
Contribution
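The author develops a stochastic optimization algorithm with fixed step sizes which, combined with the multi-scale sampling framework of Wei and Luo (2021), attains optimal dynamic regret without prior knowledge of the function change budget. The paper also shows that algorithms with high-probability regret guarantees against stationary benchmarks can be converted into algorithms with guarantees against dynamic benchmarks.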
Findings
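Achieves optimal dynamic regret without prior knowledge of the function change budget.
Shows that any algorithm achieving good regret against stationary benchmarks with high probability can be converted into one achieving good regret against dynamic benchmarks.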
Provides theoretical guarantees for adaptive bandit convex optimization.
Abstract
In this paper we study the non-stationary stochastic optimization problem with bandit feedback and dynamic regret measures. The seminal work of Besbes et al. (2015) shows that, when the aggregate function change is known a priori, a simple re-starting algorithm attains the optimal dynamic regret. In this work, we design a stochastic optimization algorithm with fixed step sizes which, combined with the multi-scale sampling framework of Wei and Luo (2021), achieves the optimal dynamic regret in non-stationary stochastic optimization without requiring prior knowledge of the function change budget, thereby closing a question that has been open for a while. We also establish an additional result showing that any algorithm achieving good regret against stationary benchmarks with high probability can be automatically converted to an algorithm that achieves good regret against dynamic…
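To make the re-starting idea mentioned in the abstract concrete, here is a minimal Python sketch. It is an illustration only, not the paper's algorithm or the exact procedure of Besbes et al. (2015). The names `restarting_schedule`, `run_with_restarts`, `GridLearner`, and `shifting_loss` are hypothetical, the learner is a toy placeholder for any stationary algorithm, and the losses are noiseless so the behaviour is deterministic. The batch length is passed in as an input, which is the quantity that has to be tuned from the change budget when that budget is known in advance.

```python
def restarting_schedule(horizon, batch_size):
    """Split rounds 0..horizon-1 into consecutive batches [start, end)."""
    return [(s, min(s + batch_size, horizon)) for s in range(0, horizon, batch_size)]


class GridLearner:
    """Toy stationary learner (assumption): try each grid point once,
    then always play the point with the lowest average observed loss."""

    def __init__(self, grid):
        self.grid = list(grid)
        self.sums = [0.0] * len(self.grid)
        self.counts = [0] * len(self.grid)
        self.last = None

    def act(self):
        untried = [i for i, c in enumerate(self.counts) if c == 0]
        if untried:
            self.last = untried[0]
        else:
            self.last = min(
                range(len(self.grid)),
                key=lambda i: self.sums[i] / self.counts[i],
            )
        return self.grid[self.last]

    def update(self, loss):
        # Bandit feedback: only the loss of the played point is observed.
        self.sums[self.last] += loss
        self.counts[self.last] += 1


def run_with_restarts(make_learner, loss_at, horizon, batch_size):
    """Run a fresh learner on every batch and return the cumulative loss."""
    total = 0.0
    for start, end in restarting_schedule(horizon, batch_size):
        learner = make_learner()  # restart: discard all past observations
        for t in range(start, end):
            x = learner.act()
            loss = loss_at(t, x)
            learner.update(loss)
            total += loss
    return total


def shifting_loss(t, x):
    """Quadratic loss whose minimizer jumps from 0.0 to 1.0 at round 50."""
    target = 0.0 if t < 50 else 1.0
    return (x - target) ** 2


def make_learner():
    return GridLearner([0.0, 0.5, 1.0])


no_restart = run_with_restarts(make_learner, shifting_loss, horizon=100, batch_size=100)
with_restart = run_with_restarts(make_learner, shifting_loss, horizon=100, batch_size=50)
```

In this toy setting the restarted run accumulates less loss than the single long run, because the fresh learner in the second batch is not anchored to stale averages. The batch length of 50 was chosen here by looking at where the function changes, which is exactly the kind of prior information the paper's adaptive approach is meant to avoid.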
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
