Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
Xuheng Li, Quanquan Gu

TL;DR
This paper introduces FGTSVA, a variance-aware Thompson Sampling algorithm for contextual bandits with general reward functions, achieving optimal regret bounds and extending the decoupling coefficient technique.
Contribution
The paper develops FGTSVA, a novel variance-aware Thompson Sampling method with optimal regret bounds for general reward functions in contextual bandits.
Findings
FGTSVA achieves a regret bound of $O\big(\sqrt{\mathrm{dc} \cdot \log|\mathcal{F}| \cdot \sum_{t=1}^T \sigma_t^2} + \mathrm{dc}\big)$.
In linear bandits, FGTSVA's regret matches that of weighted linear regression algorithms.
The extension of the decoupling coefficient provides a new analytical tool for variance-aware bandit algorithms.
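To get a feel for what a variance-dependent bound buys, the short sketch below evaluates the expression from the findings with all constants and logarithmic factors dropped. It reads `dc` as the decoupling-coefficient term and `log_f` as the log-size of the model class; both readings, the numeric values, and the "noise variance at most 1" worst case are illustrative assumptions, not figures from the paper.

```python
import math


def variance_aware_bound(dc, log_f, sigma_sq):
    """Order-level value of sqrt(dc * log|F| * sum_t sigma_t^2) + dc (constants dropped)."""
    return math.sqrt(dc * log_f * sum(sigma_sq)) + dc


def worst_case_bound(dc, log_f, horizon):
    """Same expression with every sigma_t^2 replaced by 1 (assumed worst case)."""
    return math.sqrt(dc * log_f * horizon) + dc


if __name__ == "__main__":
    T = 10_000
    dc, log_f = 10.0, 5.0        # hypothetical problem constants
    low_noise = [0.01] * T       # hypothetical per-round noise variances

    print("variance-aware:", variance_aware_bound(dc, log_f, low_noise))
    print("worst case:    ", worst_case_bound(dc, log_f, T))
    print("no noise:      ", variance_aware_bound(dc, log_f, [0.0] * T))
```

With small per-round variances the first term shrinks with the total variance rather than with the horizon, and with zero noise only the additive `dc` term remains.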
Abstract
Variance-dependent regret bounds have received increasing attention in recent studies on contextual bandits. However, most of these studies focus on upper confidence bound (UCB)-based bandit algorithms, while sampling-based bandit algorithms such as Thompson sampling remain understudied. The only exception is the LinVDTS algorithm (Xu et al., 2023), which is limited to linear reward functions and whose regret bound is not optimal with respect to the model dimension. In this paper, we present FGTSVA, a variance-aware Thompson sampling algorithm for contextual bandits with general reward functions that achieves an optimal regret bound. At the core of our analysis is an extension of the decoupling coefficient, a technique commonly used in the analysis of Feel-Good Thompson sampling (FGTS) that reflects the complexity of the model space. With the new decoupling coefficient denoted by…
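For readers unfamiliar with Feel-Good Thompson sampling, here is a minimal sketch on a toy problem with a finite model class. The posterior adds an optimism bonus (each model's best predicted reward) to a squared-error likelihood, which is the standard FGTS form. The inverse-variance weighting of the squared error is an assumption made here to illustrate variance awareness; it is not taken from the paper and should not be read as the FGTSVA posterior. The function names, the tabular models, the values of `eta` and `lam`, and the premise that the learner observes each round's noise variance are all hypothetical.

```python
import numpy as np


def fgts_posterior(log_prior, preds_chosen, preds_best, rewards, sigma_sq,
                   eta=1.0, lam=0.1):
    """Feel-good posterior over a finite model class.

    preds_chosen[s, i]: model i's prediction for the action played in round s
    preds_best[s, i]:   model i's largest predicted reward in round s
    """
    # Assumption for illustration: squared errors are weighted by 1 / sigma_s^2.
    weights = 1.0 / np.maximum(sigma_sq, 1e-8)
    sq_loss = weights[:, None] * (preds_chosen - rewards[:, None]) ** 2
    # Feel-good term: models that promise high rewards get a bonus of size lam.
    log_post = log_prior - eta * sq_loss.sum(axis=0) + lam * preds_best.sum(axis=0)
    log_post = log_post - log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


def run_toy_bandit(horizon=200, seed=0):
    """Run the sketch on random tabular models; returns cumulative regret."""
    rng = np.random.default_rng(seed)
    n_models, n_ctx, n_act = 20, 5, 3
    models = rng.uniform(0.0, 1.0, size=(n_models, n_ctx, n_act))
    truth = models[0]  # the true reward function is in the class
    log_prior = np.full(n_models, -np.log(n_models))

    preds_chosen = np.zeros((horizon, n_models))
    preds_best = np.zeros((horizon, n_models))
    rewards = np.zeros(horizon)
    sigma_sq = np.ones(horizon)
    regret = 0.0

    for t in range(horizon):
        x = rng.integers(n_ctx)
        post = fgts_posterior(log_prior, preds_chosen[:t], preds_best[:t],
                              rewards[:t], sigma_sq[:t])
        f = models[rng.choice(n_models, p=post)]  # sample one model
        a = int(np.argmax(f[x]))                  # act greedily for the sample
        sigma_sq[t] = rng.choice([0.01, 0.25])    # heteroscedastic noise level
        rewards[t] = truth[x, a] + rng.normal(0.0, np.sqrt(sigma_sq[t]))
        preds_chosen[t] = models[:, x, a]
        preds_best[t] = models[:, x].max(axis=1)
        regret += truth[x].max() - truth[x, a]
    return regret


if __name__ == "__main__":
    print("cumulative regret:", run_toy_bandit())
```

Since rewards here lie in [0, 1], the cap that FGTS usually places on the feel-good term is omitted. Low-noise rounds carry more weight in the posterior under the assumed weighting, which is the intuition behind variance-aware updates.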
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Recommender Systems and Techniques
