SCaLE: Switching Cost aware Learning and Exploration
Neelkamal Bhuyan, Debankur Mukherjee, Adam Wierman

TL;DR
This paper introduces SCaLE, an algorithm for high-dimensional bandit online convex optimization with switching costs that achieves sub-linear dynamic regret without knowledge of the hitting cost structure.
Contribution
The work presents the first algorithm with provable, distribution-agnostic sub-linear dynamic regret for a general class of stochastic environments in high-dimensional bandit settings with switching costs, along with a new spectral regret analysis.
Findings
SCaLE achieves distribution-agnostic sub-linear dynamic regret.
Spectral analysis separates eigenvalue and eigenbasis contributions to regret.
Numerical experiments against online-learning baselines corroborate the theoretical claims and highlight SCaLE's statistical consistency.
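To make the eigenvalue/eigenbasis distinction in the findings above concrete, here is a minimal sketch that compares a symmetric matrix with an estimate of it and reports two separate error terms. This is an illustration only, not the paper's regret analysis: the function name `spectral_errors`, the choice of error measures, and the assumption of distinct eigenvalues are all hypothetical.

```python
import numpy as np

def spectral_errors(A_true, A_hat):
    """Compare two symmetric matrices through their eigendecompositions.

    Returns (eigenvalue_error, eigenbasis_error). Assumes distinct eigenvalues,
    so that the eigenvectors are well defined up to sign.
    """
    lam, U = np.linalg.eigh(A_true)          # eigenvalues in ascending order
    lam_hat, U_hat = np.linalg.eigh(A_hat)
    # Largest gap between matched eigenvalues.
    eigenvalue_error = float(np.max(np.abs(lam - lam_hat)))
    # |U^T U_hat| equals the identity when both bases agree up to sign.
    overlap = np.abs(U.T @ U_hat)
    eigenbasis_error = float(np.linalg.norm(overlap - np.eye(len(lam)), ord=2))
    return eigenvalue_error, eigenbasis_error

A = np.diag([1.0, 2.0, 3.0])

# Case 1: same eigenbasis, one eigenvalue shifted.
print(spectral_errors(A, np.diag([1.1, 2.0, 3.0])))

# Case 2: same eigenvalues, eigenbasis rotated by a small angle.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
print(spectral_errors(A, R @ A @ R.T))
```

In the first case only the eigenvalue term is non-zero, and in the second only the eigenbasis term is, which is the kind of separation the spectral analysis is described as quantifying.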
Abstract
This work addresses the fundamental problem of unbounded metric movement costs in bandit online convex optimization by considering high-dimensional dynamic quadratic hitting costs and ℓ2-norm switching costs in a noisy bandit feedback model. For a general class of stochastic environments, we provide the first algorithm, SCaLE, that provably achieves a distribution-agnostic sub-linear dynamic regret without knowledge of the hitting cost structure. En route, we present a novel spectral regret analysis that separately quantifies eigenvalue-error driven regret and eigenbasis-perturbation driven regret. Extensive numerical experiments against online-learning baselines corroborate our claims and highlight the statistical consistency of our algorithm.
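The cost model in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the hitting cost is assumed to take the form 0.5 (x - v)^T A (x - v) with a positive semi-definite matrix A and a per-round minimizer v, the switching cost is assumed to be the Euclidean distance between consecutive actions, and all function names are hypothetical.

```python
import math

def hitting_cost(x, minimizer, A):
    """Assumed quadratic hitting cost 0.5 * (x - v)^T A (x - v)."""
    d = [xi - vi for xi, vi in zip(x, minimizer)]
    Ad = [sum(a * dj for a, dj in zip(row, d)) for row in A]
    return 0.5 * sum(di * adi for di, adi in zip(d, Ad))

def switching_cost(x, x_prev):
    """Assumed Euclidean-norm cost of moving from x_prev to x."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prev)))

def total_cost(actions, minimizers, A, x0):
    """Sum of hitting and switching costs over a sequence of rounds."""
    total, prev = 0.0, x0
    for x, v in zip(actions, minimizers):
        total += hitting_cost(x, v, A) + switching_cost(x, prev)
        prev = x
    return total

A = [[2.0, 0.0], [0.0, 1.0]]
x0 = [0.0, 0.0]
minimizers = [[3.0, 4.0], [3.0, 4.0]]

# Jumping straight to the minimizer pays only the movement cost.
jump = total_cost([[3.0, 4.0], [3.0, 4.0]], minimizers, A, x0)
# Staying put avoids movement but pays the hitting cost every round.
stay = total_cost([[0.0, 0.0], [0.0, 0.0]], minimizers, A, x0)
print(jump, stay)
```

Dynamic regret compares the learner's total cost of this kind with that of the best sequence of actions in hindsight. In the bandit setting, the learner would only observe noisy evaluations of the hitting cost rather than A or the minimizers themselves.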
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
