Precise Asymptotics and Refined Regret of Variance-Aware UCB

Yingying Fan; Yuxuan Han; Jinchi Lv; Xiaocong Xu; Zhengyuan Zhou

arXiv:2412.08843 · stat.ML · February 18, 2025

Open Access

TL;DR

This paper gives a precise asymptotic analysis of the variance-aware UCB-V algorithm for multi-armed bandits, shows that its arm-pulling rates can be unstable, and derives refined regret bounds from a non-asymptotic, high-probability analysis.

Contribution

The paper extends recent asymptotic results on arm-pulling rates from the canonical UCB to UCB-V, shows that UCB-V can be unstable, and derives refined regret bounds from a non-asymptotic, high-probability analysis of the arm-pulling rates.

Findings

01 UCB-V exhibits potential asymptotic instability in arm-pulling rates.

02 Non-asymptotic bounds provide high probability guarantees for arm-pulling.

03 Refined regret bounds are established for UCB-V, surpassing previous results.

Abstract

In this paper, we study the behavior of the Upper Confidence Bound-Variance (UCB-V) algorithm for Multi-Armed Bandit (MAB) problems. UCB-V is a variant of the canonical Upper Confidence Bound (UCB) algorithm that incorporates variance estimates into its decision-making process. More precisely, we provide an asymptotic characterization of the arm-pulling rates for UCB-V, extending recent results for the canonical UCB in Kalvit and Zeevi (2021) and Khamaru and Zhang (2024). In contrast to the canonical UCB, our analysis reveals that UCB-V can exhibit instability, meaning that its arm-pulling rates may not always be asymptotically deterministic. Besides the asymptotic characterization, we also provide non-asymptotic bounds for the arm-pulling rates in the high-probability regime, offering insights into the regret analysis. As an application of this high…
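As a rough illustration of what "incorporating variance estimates" means, the sketch below runs a UCB-V style rule on Bernoulli arms and returns the number of pulls per arm, which is the quantity the abstract calls the arm-pulling rate. The index used here is the commonly cited form from Audibert, Munos, and Szepesvári (empirical mean, plus a variance-scaled bonus, plus a range correction); the exact index and constants analyzed in this paper may differ. The function names `ucbv_index` and `run_ucbv`, the Bernoulli reward model, and the defaults `zeta=1.2`, `c=1.0`, `b=1.0` are assumptions made for illustration only.

```python
import math
import random


def ucbv_index(mean, var, pulls, t, b=1.0, zeta=1.2, c=1.0):
    """UCB-V style index (assumed form): mean + variance bonus + range term.

    b is an assumed upper bound on the rewards; zeta and c are
    illustrative tuning constants, not values taken from the paper.
    """
    explore = zeta * math.log(t)
    variance_bonus = math.sqrt(2.0 * var * explore / pulls)
    range_term = c * 3.0 * b * explore / pulls
    return mean + variance_bonus + range_term


def run_ucbv(success_probs, horizon, seed=0):
    """Run the rule on Bernoulli arms and return the pull count of each arm."""
    rng = random.Random(seed)
    k = len(success_probs)
    pulls = [0] * k
    sums = [0.0] * k
    sq_sums = [0.0] * k

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: pull every arm once
        else:
            indices = []
            for a in range(k):
                mean = sums[a] / pulls[a]
                # Empirical variance, clipped at 0 against rounding error.
                var = max(sq_sums[a] / pulls[a] - mean * mean, 0.0)
                indices.append(ucbv_index(mean, var, pulls[a], t))
            arm = max(range(k), key=indices.__getitem__)

        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        pulls[arm] += 1
        sums[arm] += reward
        sq_sums[arm] += reward * reward

    return pulls


if __name__ == "__main__":
    # Two arms with a small gap; the printed counts are the arm-pulling numbers.
    print(run_ucbv([0.5, 0.6], horizon=5000))
```

Rerunning `run_ucbv` with different `seed` values and comparing the returned counts is one informal way to see how much the arm-pulling numbers vary across runs, though a simulation like this does not by itself demonstrate the instability result stated in the abstract.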

Taxonomy

Topics: Neural Networks and Applications · Metaheuristic Optimization Algorithms Research