Contextual Thompson Sampling via Generation of Missing Data

Kelly W. Zhang; Tiffany Tianhui Cai; Hongseok Namkoong; Daniel Russo

arXiv:2502.07064 · cs.LG · November 13, 2025

TL;DR

This paper presents a contextual Thompson sampling framework in which a generative model, learned offline, imputes missing outcomes. The quality of that model determines how well the algorithm quantifies uncertainty and makes decisions, and the paper proves a regret bound for the resulting algorithm.

Contribution

The paper introduces a generative-model-based approach to Thompson sampling (TS) in contextual bandits, with a formal regret analysis whose bound depends on the offline prediction quality of the generative model.

Findings

01. Regret bounds depend on the generative model's offline prediction loss.

02. The algorithm imputes missing outcomes (both future and counterfactual) and fits a policy on the completed dataset to select each action.

03. The framework achieves state-of-the-art regret guarantees.

Abstract

We introduce a framework for Thompson sampling (TS) contextual bandit algorithms, in which the algorithm's ability to quantify uncertainty and make decisions depends on the quality of a generative model that is learned offline. Instead of viewing uncertainty in the environment as arising from unobservable latent parameters, our algorithm treats uncertainty as stemming from missing, but potentially observable outcomes (including both future and counterfactual outcomes). If these outcomes were all observed, one could simply make decisions using an "oracle" policy fit on the complete dataset. Inspired by this conceptualization, at each decision-time, our algorithm uses a generative model to probabilistically impute missing outcomes, fits a policy using the imputed complete dataset, and uses that policy to select the next action. We formally show that this algorithm is a generative…
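
The loop described in the abstract (impute the missing outcomes, fit a policy on the completed dataset, act) can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the authors' implementation: contexts are discrete and all known up front, outcomes are binary, and the offline-learned generative model is replaced by a Beta(1, 1)-Bernoulli Pólya urn per (context, arm) cell. The names `impute_missing`, `fit_oracle_policy`, and `run_bandit` are hypothetical.

```python
import numpy as np


def impute_missing(outcomes, contexts, n_contexts, rng):
    """Fill NaN entries of the (T, K) potential-outcome table by sequential sampling.

    Stand-in generative model (an assumption, not the paper's learned model):
    a Beta(1, 1)-Bernoulli Polya urn per (context, arm) cell. Each imputed
    value is fed back into the counts before the next draw.
    """
    filled = outcomes.copy()
    n_arms = outcomes.shape[1]
    for x in range(n_contexts):
        rows = np.flatnonzero(contexts == x)
        for a in range(n_arms):
            col = filled[rows, a]  # fancy indexing returns a copy
            observed = ~np.isnan(col)
            successes = 1.0 + col[observed].sum()
            failures = 1.0 + (1.0 - col[observed]).sum()
            for i in np.flatnonzero(~observed):
                draw = float(rng.random() < successes / (successes + failures))
                col[i] = draw
                successes += draw
                failures += 1.0 - draw
            filled[rows, a] = col
    return filled


def fit_oracle_policy(complete, contexts, n_contexts):
    """Pick, for each context, the arm with the highest mean outcome
    in a fully observed (here: imputed) table."""
    policy = np.zeros(n_contexts, dtype=int)
    for x in range(n_contexts):
        rows = contexts == x
        if rows.any():
            policy[x] = int(np.argmax(complete[rows].mean(axis=0)))
    return policy


def run_bandit(true_means, horizon, seed=0):
    """Run the impute -> fit -> act loop on a simulated Bernoulli bandit."""
    rng = np.random.default_rng(seed)
    n_contexts, n_arms = true_means.shape
    # Simplifying assumption: the contexts of all rounds are known in advance.
    contexts = rng.integers(n_contexts, size=horizon)
    outcomes = np.full((horizon, n_arms), np.nan)  # NaN marks a missing outcome
    actions = np.empty(horizon, dtype=int)
    for t in range(horizon):
        complete = impute_missing(outcomes, contexts, n_contexts, rng)
        policy = fit_oracle_policy(complete, contexts, n_contexts)
        a = policy[contexts[t]]
        # Only the chosen arm's outcome is revealed; the rest stay missing.
        outcomes[t, a] = float(rng.random() < true_means[contexts[t], a])
        actions[t] = a
    return contexts, actions, outcomes
```

For example, `run_bandit(np.array([[0.9, 0.1], [0.1, 0.9]]), horizon=200)` returns the contexts, the chosen actions, and the partially observed outcome table. In this sketch the randomness that drives exploration comes entirely from the imputation step, so replacing the urn with a different generative model would change only `impute_missing`.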

Videos

Contextual Thompson Sampling via Generation of Missing Data · SlidesLive

Taxonomy

Topics: Survey Sampling and Estimation Techniques · Bayesian Methods and Mixture Models · Machine Learning and Algorithms