Thompson Sampling via Local Uncertainty

Zhendong Wang; Mingyuan Zhou

arXiv:1910.13673 · stat.ML · August 7, 2020

TL;DR

This paper introduces a novel Thompson sampling approach that leverages local latent variable uncertainty with variational inference, achieving state-of-the-art results in contextual bandit tasks with low computational cost.

Contribution

It proposes a new probabilistic framework using local uncertainty for Thompson sampling, enhancing expressiveness with semi-implicit variational inference.
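
As a rough illustration of what action selection driven by local uncertainty can look like, the sketch below draws a per-context latent variable z from a Gaussian whose mean and standard deviation depend on the context, feeds the context and z through a deterministic decoder to get one mean reward per arm, and plays the arm with the highest sampled mean reward. Everything here is an assumption made for illustration: the linear "encoder" and "decoder", the weight values, and the names `sample_local_latent`, `mean_rewards`, and `choose_arm` are hypothetical and are not taken from the paper or its repository. Fitting the weights with variational inference and the semi-implicit extension are both omitted.

```python
import math
import random

def sample_local_latent(context, enc_mean_w, enc_logstd_w, rng):
    """Draw z ~ N(mu(x), sigma(x)^2), a latent variable local to this context."""
    z = []
    for w_mu, w_ls in zip(enc_mean_w, enc_logstd_w):
        mu = sum(w * x for w, x in zip(w_mu, context))
        sigma = math.exp(sum(w * x for w, x in zip(w_ls, context)))
        z.append(rng.gauss(mu, sigma))
    return z

def mean_rewards(context, z, dec_w):
    """Deterministic toy decoder: one mean reward per arm from the features [x, z]."""
    features = list(context) + list(z)
    return [sum(w * f for w, f in zip(arm_w, features)) for arm_w in dec_w]

def choose_arm(context, enc_mean_w, enc_logstd_w, dec_w, rng):
    """Sample z once, then act greedily with respect to the sampled mean rewards."""
    z = sample_local_latent(context, enc_mean_w, enc_logstd_w, rng)
    rewards = mean_rewards(context, z, dec_w)
    return max(range(len(rewards)), key=rewards.__getitem__)

# Hypothetical fixed weights: 2-d context, 1-d latent, 2 arms.
enc_mean_w = [[0.5, -0.5]]           # mu(x) = 0.5*x1 - 0.5*x2
enc_logstd_w = [[0.0, 0.0]]          # sigma(x) = exp(0) = 1
dec_w = [[1.0, 0.0, 1.0],            # arm 0: x1 + z
         [0.0, 1.0, -1.0]]           # arm 1: x2 - z

rng = random.Random(0)
context = [1.0, 1.0]
choices = [choose_arm(context, enc_mean_w, enc_logstd_w, dec_w, rng)
           for _ in range(200)]
```

For this context the two arms have the same expected reward, so the randomness in z alone decides which arm is played on each round. That is the sense in which sampling a local variable, rather than sampling network weights, can drive exploration in this toy setup.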

Findings

1. Achieves state-of-the-art performance on eight benchmark datasets.

2. Maintains low computational complexity compared to existing methods.

3. Utilizes local latent variable uncertainty for improved exploration.

Abstract

Thompson sampling is an efficient algorithm for sequential decision making, which exploits the posterior uncertainty to address the exploration-exploitation dilemma. There has been significant recent interest in integrating Bayesian neural networks into Thompson sampling. Most of these methods rely on global variable uncertainty for exploration. In this paper, we propose a new probabilistic modeling framework for Thompson sampling, where local latent variable uncertainty is used to sample the mean reward. Variational inference is used to approximate the posterior of the local variable, and semi-implicit structure is further introduced to enhance its expressiveness. Our experimental results on eight contextual bandit benchmark datasets show that Thompson sampling guided by local uncertainty achieves state-of-the-art performance while having low computational complexity.
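
For readers unfamiliar with the basic algorithm the abstract starts from, here is a minimal sketch of classic Thompson sampling on a non-contextual Bernoulli bandit with Beta posteriors. It is the textbook version of the algorithm, not the method proposed in the paper: there is no context, no neural network, and no latent variable. The function name `thompson_bernoulli` and the arm probabilities are illustrative assumptions.

```python
import random

def thompson_bernoulli(true_means, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns how often each arm was pulled."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    alpha = [1] * n_arms  # Beta(1, 1) prior: 1 + observed successes
    beta = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible mean reward per arm from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a simulated Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Hypothetical arm success probabilities, chosen only for the demo.
pulls = thompson_bernoulli([0.2, 0.5, 0.8], n_rounds=2000)
```

Arms with wide posteriors occasionally produce large samples and get tried, while arms whose posteriors have concentrated on low values are rarely chosen. This is how posterior uncertainty trades off exploration against exploitation without an explicit exploration schedule.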

Code & Models

Repositories

Zhendong-Wang/Thompson-Sampling-via-Local-Uncertainty (TensorFlow, official)

Videos

Thompson Sampling via Local Uncertainty · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms