Precise Asymptotics and Refined Regret of Variance-Aware UCB
Yingying Fan, Yuxuan Han, Jinchi Lv, Xiaocong Xu, Zhengyuan Zhou

TL;DR
This paper provides a detailed asymptotic analysis of the variance-aware UCB-V algorithm for multi-armed bandits, revealing potential instability and offering refined regret bounds through non-asymptotic high probability analysis.
Contribution
It extends asymptotic results for UCB-V, uncovers its instability, and derives refined regret bounds using non-asymptotic high probability arm-pulling rate analysis.
Findings
UCB-V exhibits potential asymptotic instability in arm-pulling rates.
Non-asymptotic bounds provide high probability guarantees for arm-pulling.
Refined regret bounds are established for UCB-V, surpassing previous results.
Abstract
In this paper, we study the behavior of the Upper Confidence Bound-Variance (UCB-V) algorithm for Multi-Armed Bandit (MAB) problems. UCB-V is a variant of the canonical Upper Confidence Bound (UCB) algorithm that incorporates variance estimates into its decision-making process. More precisely, we provide an asymptotic characterization of the arm-pulling rates for UCB-V, extending recent results for the canonical UCB in Kalvit and Zeevi (2021) and Khamaru and Zhang (2024). In an interesting contrast to the canonical UCB, our analysis reveals that the behavior of UCB-V can exhibit instability, meaning that the arm-pulling rates may not always be asymptotically deterministic. Besides the asymptotic characterization, we also provide non-asymptotic bounds for the arm-pulling rates in the high probability regime, offering insights into the regret analysis. As an application of this high…
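To make "incorporates variance estimates into its decision-making process" concrete, the following is a minimal sketch of a UCB-V style index rule and the arm-pulling rates it produces. It assumes the classical index form, empirical mean plus sqrt(2 * variance * log t / n) plus 3 * b * log t / n, for rewards in [0, b]; the constants and exploration function analyzed in the paper may differ. The names `ucbv_index`, `run_ucbv`, `reward_range`, and `zeta` are illustrative and do not come from the paper.

```python
import math
import random


def ucbv_index(mean, var, pulls, t, reward_range=1.0, zeta=1.0):
    """Assumed classical UCB-V index: mean + variance bonus + range correction."""
    explore = zeta * math.log(t)  # exploration function (assumed zeta * log t)
    variance_bonus = math.sqrt(2.0 * var * explore / pulls)
    range_bonus = 3.0 * reward_range * explore / pulls
    return mean + variance_bonus + range_bonus


def run_ucbv(arms, horizon, seed=0):
    """Run the sketch for `horizon` rounds and return the pull count of each arm.

    `arms` is a list of callables; each takes a random.Random instance and
    returns a reward in [0, 1].
    """
    rng = random.Random(seed)
    k = len(arms)
    pulls = [0] * k
    sums = [0.0] * k
    sq_sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize its statistics
        else:
            indices = []
            for a in range(k):
                mean = sums[a] / pulls[a]
                # Empirical (biased) variance, clipped at zero for rounding error.
                var = max(sq_sums[a] / pulls[a] - mean * mean, 0.0)
                indices.append(ucbv_index(mean, var, pulls[a], t))
            arm = max(range(k), key=indices.__getitem__)
        reward = arms[arm](rng)
        pulls[arm] += 1
        sums[arm] += reward
        sq_sums[arm] += reward * reward
    return pulls


# Two hypothetical Bernoulli arms with means 0.9 and 0.5.
demo_arms = [
    lambda rng: 1.0 if rng.random() < 0.9 else 0.0,
    lambda rng: 1.0 if rng.random() < 0.5 else 0.0,
]
demo_pulls = run_ucbv(demo_arms, horizon=2000)
demo_rates = [n / sum(demo_pulls) for n in demo_pulls]  # arm-pulling rates
print(demo_pulls, demo_rates)
```

Rerunning `run_ucbv` with different seeds and comparing the resulting rates is one simple way to observe empirically whether the arm-pulling rates concentrate around a single value or vary from run to run, which is the kind of behavior the abstract calls instability.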
Taxonomy
Topics: Neural Networks and Applications · Metaheuristic Optimization Algorithms Research
