Thompson Sampling for (Combinatorial) Pure Exploration

Siwei Wang; Jun Zhu

arXiv:2206.09150 · cs.LG · June 22, 2022 · 1 citation

Open Access

TL;DR

This paper introduces TS-Explore, a Thompson Sampling-based algorithm for (combinatorial) pure exploration. It replaces the sum of per-arm upper confidence bounds with a sum of independent random samples, which gives a lower complexity upper bound than existing UCB-based methods.

Contribution

The paper presents the first Thompson Sampling-based algorithm for (combinatorial) pure exploration and shows that it achieves a lower complexity upper bound than UCB-based algorithms that sum per-arm upper confidence bounds.

Findings

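01. TS-Explore achieves a lower complexity upper bound than UCB-based algorithms for combinatorial pure exploration.

02. TS-Explore is asymptotically optimal for pure exploration in the classic multi-armed bandit.

03. Summing independent random samples over an arm set $S$, instead of summing per-arm upper confidence bounds, keeps the estimate within the tight upper confidence bound of $S$ with high probability.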

Abstract

Existing methods of combinatorial pure exploration mainly focus on the UCB approach. To make the algorithm efficient, they usually use the sum of upper confidence bounds within arm set $S$ to represent the upper confidence bound of $S$, which can be much larger than the tight upper confidence bound of $S$ and leads to a much higher complexity than necessary, since the empirical means of different arms in $S$ are independent. To deal with this challenge, we explore the idea of Thompson Sampling (TS) that uses independent random samples instead of the upper confidence bounds, and design the first TS-based algorithm TS-Explore for (combinatorial) pure exploration. In TS-Explore, the sum of independent random samples within arm set $S$ will not exceed the tight upper confidence bound of $S$ with high probability. Hence it solves the above challenge, and achieves a lower complexity upper…
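The gap the abstract describes can be illustrated numerically. The sketch below is not TS-Explore itself. It assumes Gaussian rewards with unit variance, the same number of pulls for every arm in $S$, and a simplified sampling distribution whose variance is 1/n per arm; the function names, the constants, and the choice of delta are hypothetical and only serve the illustration. Under those assumptions, the sum of per-arm confidence radii grows linearly in the size of $S$, the tight radius for the sum grows with its square root, and a sum of independent samples rarely exceeds the tight radius.

```python
import math
import random


def per_arm_radius(n_pulls, delta, sigma=1.0):
    """Gaussian-tail confidence radius for one arm's empirical mean (assumed form)."""
    return sigma * math.sqrt(2.0 * math.log(1.0 / delta) / n_pulls)


def sum_of_ucb_radius(m, n_pulls, delta, sigma=1.0):
    """Radius obtained by adding the per-arm radii of all m arms in S."""
    return m * per_arm_radius(n_pulls, delta, sigma)


def tight_set_radius(m, n_pulls, delta, sigma=1.0):
    """Radius for the sum of m independent empirical means (variance m * sigma^2 / n)."""
    return sigma * math.sqrt(2.0 * m * math.log(1.0 / delta) / n_pulls)


def exceed_fraction(m, n_pulls, delta, trials=20000, sigma=1.0, seed=0):
    """Fraction of trials in which a sum of independent samples exceeds the tight radius.

    Each draw models theta_i - mu_hat_i for one arm as N(0, sigma^2 / n_pulls),
    which is a simplifying assumption made for this illustration.
    """
    rng = random.Random(seed)
    sample_std = sigma / math.sqrt(n_pulls)
    tight = tight_set_radius(m, n_pulls, delta, sigma)
    exceed = 0
    for _ in range(trials):
        total = sum(rng.gauss(0.0, sample_std) for _ in range(m))
        if total > tight:
            exceed += 1
    return exceed / trials


# Hypothetical setting: 25 arms in S, 100 pulls each, confidence level 0.05.
m, n_pulls, delta = 25, 100, 0.05
loose = sum_of_ucb_radius(m, n_pulls, delta)
tight = tight_set_radius(m, n_pulls, delta)
frac = exceed_fraction(m, n_pulls, delta)

print(f"sum of per-arm radii: {loose:.3f}")
print(f"tight radius for S:   {tight:.3f}")
print(f"ratio (loose/tight):  {loose / tight:.3f}")
print(f"fraction of sampled sums above the tight radius: {frac:.4f}")
```

In this simplified setting the ratio of the two radii equals the square root of the number of arms in $S$, which is one way to read the abstract's remark that the summed bound "can be much larger than the tight upper confidence bound of $S$".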

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems