PFN-TS: Thompson Sampling for Contextual Bandits via Prior-Data Fitted Networks

Yan Shuo Tan; Kenyon Ng; Ruizhe Deng; Sumetha Loganathan; Qiong Zhang; Bibhas Chakraborty

arXiv:2605.10137·stat.ML·May 12, 2026



TL;DR

PFN-TS is a Thompson sampling method for contextual bandits that uses prior-data fitted networks to approximate Bayesian posteriors efficiently. It reports strong benchmark performance and comes with consistency and regret guarantees.

Contribution

The paper develops PFN-TS, a Thompson sampling algorithm that converts PFN posterior predictives into mean-reward samples using a subsampled variance estimator, with proven consistency and regret bounds.

Findings

01. PFN-TS achieves top average rank on synthetic and OpenML benchmarks.

02. It remains competitive on linear and BART-generated rewards.

03. PFN-TS attains highest estimated policy value in offline mobile-health evaluation.

Abstract

Thompson sampling is a widely used strategy for contextual bandits: at each round, it samples a reward function from a Bayesian posterior and acts greedily under that sample. Prior-data fitted networks (PFNs), such as TabPFN v2+ and TabICL v2, are attractive candidates for this purpose because they approximate Bayesian posterior predictive distributions in a single forward pass. However, PFNs predict noisy future rewards, while Thompson sampling requires uncertainty over the latent mean reward function. We propose PFN-TS, a Thompson sampling algorithm that converts PFN posterior predictives into mean-reward samples using a subsampled predictive central limit theorem. The method estimates posterior variance from a geometric grid of $O(\log n)$ dataset prefixes rather than the full $O(n)$ predictive sequence used in previous predictive-sequence approaches, and reuses TabICL's cached…
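Two ideas from the abstract can be illustrated with a minimal Python sketch: a geometric grid that visits only $O(\log n)$ prefix lengths of a dataset of size $n$, and a Thompson step that draws one mean-reward sample per arm and acts greedily. Everything named here is an assumption for illustration: the function names, the base-2 grid, and the Gaussian draw around a given mean are not taken from the paper. The per-arm means and variances are treated as inputs, and the paper's subsampled variance estimator is not reproduced.

```python
import math
import random


def geometric_prefix_sizes(n, base=2):
    """Prefix lengths 1, base, base**2, ... below n, followed by n itself.

    Hypothetical helper: the grid has O(log n) entries instead of n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    sizes = []
    k = 1
    while k < n:
        sizes.append(k)
        k *= base
    sizes.append(n)  # always include the full dataset
    return sizes


def thompson_step(means, variances, rng):
    """Draw one mean-reward sample per arm and return the greedy arm index.

    Assumption: each arm's mean-reward posterior is approximated by a
    Gaussian with the given mean and variance.
    """
    samples = [rng.gauss(m, math.sqrt(v)) for m, v in zip(means, variances)]
    return max(range(len(samples)), key=samples.__getitem__)


if __name__ == "__main__":
    print(geometric_prefix_sizes(1000))
    rng = random.Random(0)
    print(thompson_step([0.1, 0.9, 0.3], [0.05, 0.05, 0.05], rng))
```

With a base-2 grid, a history of one million rounds requires about twenty prefix evaluations rather than one million, which is the source of the $O(\log n)$ versus $O(n)$ contrast drawn in the abstract.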


Code & Models

Repositories

anonymous/4open.science/r/PFN_TS-36ED
github
