Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
Ha Manh Bui, Enrique Mallada, Anqi Liu

TL;DR
This paper introduces a variance-aware neural UCB algorithm that leverages deep neural networks and variance estimation to improve exploration, leading to better regret bounds and empirical performance in neural contextual bandits.
Contribution
The paper proposes a variance-aware neural UCB algorithm in two versions: an oracle version that assumes a known upper bound on the reward noise variance, and a practical version that estimates this bound. It provides regret analysis for both.
Findings
The oracle version achieves a better regret guarantee than other neural-UCB algorithms.
Outperforms state-of-the-art methods on multiple datasets.
Provides a practical variance estimation method with similar computational efficiency.
Abstract
By leveraging the representation power of deep neural networks, neural upper confidence bound (UCB) algorithms have shown success in contextual bandits. To further balance the exploration and exploitation, we propose Neural-$\sigma^2$-LinearUCB, a variance-aware algorithm that utilizes $\sigma_t^2$, i.e., an upper bound of the reward noise variance at round $t$, to enhance the uncertainty quantification quality of the UCB, resulting in a regret performance improvement. We provide an oracle version for our algorithm characterized by an oracle variance upper bound and a practical version with a novel estimation for this variance bound. Theoretically, we provide rigorous regret analysis for both versions and prove that our oracle algorithm achieves a better regret guarantee than other neural-UCB algorithms in the neural contextual bandits setting. Empirically, our practical…
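To make the idea of a variance-aware UCB concrete, the following is a minimal sketch of a generic variance-weighted linear UCB loop. It is not the authors' algorithm: it assumes the features are already given (standing in for a learned deep representation), it assumes the variance bound enters through inverse-variance weights in a ridge regression, and the names `variance_weighted_linucb`, `alpha`, and `lam` are hypothetical choices made for illustration.

```python
import numpy as np


def variance_weighted_linucb(features, rewards_fn, sigma2, alpha=1.0, lam=1.0):
    """Generic variance-weighted linear UCB (illustrative sketch only).

    features:   array (T, K, d), one d-dim feature vector per arm per round
                (assumed fixed here; a stand-in for a deep representation)
    rewards_fn: callable (t, arm) -> float, returns the observed reward
    sigma2:     array (T,), assumed upper bounds on the noise variance
    alpha, lam: hypothetical exploration scale and ridge regularizer
    """
    T, K, d = features.shape
    A = lam * np.eye(d)   # inverse-variance-weighted Gram matrix
    b = np.zeros(d)       # inverse-variance-weighted feature-reward sum
    chosen = np.empty(T, dtype=int)
    for t in range(T):
        A_inv = np.linalg.inv(A)
        theta = A_inv @ b                      # weighted ridge estimate
        x = features[t]                        # shape (K, d)
        mean = x @ theta                       # predicted reward per arm
        # Confidence width sqrt(x^T A^{-1} x) for every arm at once.
        width = np.sqrt(np.einsum("kd,de,ke->k", x, A_inv, x))
        arm = int(np.argmax(mean + alpha * width))
        r = rewards_fn(t, arm)
        w = 1.0 / sigma2[t]                    # low-noise rounds count more
        A += w * np.outer(x[arm], x[arm])
        b += w * r * x[arm]
        chosen[t] = arm
    return chosen, np.linalg.solve(A, b)


# Small synthetic run; all values below are made up for illustration.
rng = np.random.default_rng(0)
T, K, d = 200, 5, 3
theta_true = np.array([1.0, -0.5, 0.25])
feats = rng.normal(size=(T, K, d))
noise_std = 0.1


def reward(t, a):
    return float(feats[t, a] @ theta_true + noise_std * rng.normal())


arms, theta_hat = variance_weighted_linucb(feats, reward, np.full(T, noise_std**2))
print(np.round(theta_hat, 2))
```

In this sketch, rounds with a smaller variance bound contribute more to both the estimate and the confidence width, so the exploration bonus shrinks faster where observations are more reliable. How the paper obtains or estimates the bound in its practical version is not shown here.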
Taxonomy
Topics: Advanced Bandit Algorithms Research · Adaptive Dynamic Programming Control · Cognitive Radio Networks and Spectrum Sensing
