The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling

Nima Hamidi; Mohsen Bayati

arXiv:2102.07987 · stat.ML · January 20, 2022 · 1 citation

Open Access

TL;DR

This paper extends the elliptical potential lemma to non-Gaussian noise and priors in linear bandits, enabling improved regret bounds for Thompson sampling with general distributions.

Contribution

It introduces a generalized elliptical potential lemma that relaxes Gaussian assumptions, broadening its applicability in sequential learning algorithms.

Findings

01

Provides a non-Gaussian elliptical potential lemma.

02

Proves an improved Bayesian regret bound for Thompson sampling.

03

Achieves minimax optimal regret bounds up to constants.

Abstract

In this note, we introduce a general version of the well-known elliptical potential lemma, a widely used technique in the analysis of algorithms in sequential learning and decision-making problems. We consider a stochastic linear bandit setting where a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected reward over a decision-making horizon. The elliptical potential lemma is a key tool for quantifying uncertainty in estimating parameters of the reward function, but it requires the noise and the prior distributions to be Gaussian. Our general elliptical potential lemma relaxes this Gaussian requirement, which is a highly non-trivial extension for a number of reasons: unlike the Gaussian case, there is no closed-form solution for the covariance matrix of the posterior distribution, the…
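For orientation, the classical (deterministic) form of the elliptical potential lemma, not the generalized version this note introduces, can be checked numerically. With V_0 = lam * I, V_t = V_{t-1} + x_t x_t^T and every action satisfying ||x_t|| <= 1, the classical lemma states that sum_t min(1, x_t^T V_{t-1}^{-1} x_t) <= 2 log(det V_n / det V_0) <= 2 d log(1 + n / (d * lam)). The sketch below is only an illustration under those assumptions: the names `elliptical_potential`, `random_unit_vector`, `lam`, and the random unit-norm actions are hypothetical choices and do not come from the paper.

```python
import math
import random


def elliptical_potential(actions, lam=1.0):
    """Return (potential, log_det_ratio) for a sequence of action vectors.

    potential     = sum_t min(1, x_t^T V_{t-1}^{-1} x_t)
    log_det_ratio = log det(V_n) - log det(V_0), with V_0 = lam * I
    """
    d = len(actions[0])
    # Inverse of V_0 = lam * I, stored as a dense list of rows.
    v_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)] for i in range(d)]
    potential = 0.0
    log_det_ratio = 0.0
    for x in actions:
        # u = V_{t-1}^{-1} x and q = x^T V_{t-1}^{-1} x.
        u = [sum(v_inv[i][j] * x[j] for j in range(d)) for i in range(d)]
        q = sum(x[i] * u[i] for i in range(d))
        potential += min(1.0, q)
        # Matrix determinant lemma: det(V + x x^T) = det(V) * (1 + q).
        log_det_ratio += math.log1p(q)
        # Sherman-Morrison rank-one update of the inverse.
        for i in range(d):
            for j in range(d):
                v_inv[i][j] -= u[i] * u[j] / (1.0 + q)
    return potential, log_det_ratio


def random_unit_vector(rng, d):
    """Draw a direction uniformly on the unit sphere (illustrative action)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


rng = random.Random(0)
d, n, lam = 5, 2000, 1.0
actions = [random_unit_vector(rng, d) for _ in range(n)]
potential, log_det_ratio = elliptical_potential(actions, lam)
# Closed-form bound, using the assumption ||x_t|| <= 1.
closed_form = 2 * d * math.log(1 + n / (d * lam))
print(potential, 2 * log_det_ratio, closed_form)
```

The three printed numbers should appear in non-decreasing order. The potential grows only logarithmically in n, which is what makes the lemma useful for regret analysis. The sketch tracks only the matrix V_t and does not model any posterior distribution, so it says nothing about the non-Gaussian setting the abstract describes.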

Taxonomy

Topics: Advanced Bandit Algorithms Research · Decision-Making and Behavioral Economics · Advanced Causal Inference Techniques