Multi-task Representation Learning for Pure Exploration in Linear Bandits

Yihan Du; Longbo Huang; Wen Sun

arXiv:2302.04441 · cs.LG · May 31, 2023

TL;DR

This paper introduces multi-task representation learning algorithms for pure exploration in linear bandits. The algorithms reduce sample complexity by exploiting a low-dimensional linear representation shared across tasks.

Contribution

It proposes the first algorithms demonstrating how shared representations can accelerate pure exploration in multi-task linear bandit problems.

Findings

01. Sample complexity is significantly improved over independent task solutions.

02. Algorithms DouExpDes and C-DouExpDes effectively plan optimal sample allocations.

03. First demonstration of representation learning benefits in multi-task pure exploration.

Abstract

Despite the recent success of representation learning in sequential decision making, the study of the pure exploration scenario (i.e., identify the best option and minimize the sample complexity) is still limited. In this paper, we study multi-task representation learning for best arm identification in linear bandits (RepBAI-LB) and best policy identification in contextual linear bandits (RepBPI-CLB), two popular pure exploration settings with wide applications, e.g., clinical trials and web content optimization. In these two problems, all tasks share a common low-dimensional linear representation, and our goal is to leverage this feature to accelerate the best arm (policy) identification process for all tasks. For these problems, we design computationally and sample efficient algorithms DouExpDes and C-DouExpDes, which perform double experimental designs to plan optimal sample…
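The shared-representation assumption described in the abstract can be illustrated with a small toy sketch. It assumes one common way to formalize the setting: each task's unknown reward parameter is written as theta_m = B w_m, where B is a d x k matrix shared by all tasks and w_m is a task-specific k-dimensional weight vector. The names `B`, `W`, `thetas`, `make_tasks`, and `best_arm`, the Gaussian sampling, and the standard-basis arm set are all hypothetical choices made for illustration; none of this is the DouExpDes or C-DouExpDes procedure.

```python
import random


def make_tasks(d, k, num_tasks, seed=0):
    """Build toy task parameters theta_m = B w_m with a shared d x k matrix B."""
    rng = random.Random(seed)
    # Shared representation B (d x k); Gaussian entries are an illustrative assumption.
    B = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]
    # Task-specific low-dimensional weights w_m (one length-k vector per task).
    W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(num_tasks)]
    # Each task parameter lies in the k-dimensional column space of B.
    thetas = [
        [sum(B[i][j] * w[j] for j in range(k)) for i in range(d)]
        for w in W
    ]
    return B, W, thetas


def best_arm(arms, theta):
    """Return the index of the arm with the largest expected reward <x, theta>."""
    rewards = [sum(x_i * t_i for x_i, t_i in zip(x, theta)) for x in arms]
    return max(range(len(arms)), key=rewards.__getitem__)


d, k, num_tasks = 20, 2, 10
B, W, thetas = make_tasks(d, k, num_tasks)

# Hypothetical arm set: the d standard basis vectors.
arms = [[1.0 if i == j else 0.0 for i in range(d)] for j in range(d)]
best = [best_arm(arms, theta) for theta in thetas]

# Rough unknown counts: one shared B plus k weights per task,
# versus a full d-dimensional parameter for every task.
shared_params = d * k + num_tasks * k
independent_params = num_tasks * d
print(best, shared_params, independent_params)
```

The unknown count at the end is only heuristic intuition for why sharing a representation can help when k is much smaller than d; it is not the sample complexity bound stated in the paper.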

Videos

Multi-task Representation Learning for Pure Exploration in Linear Bandits (SlidesLive)

Taxonomy

Topics: Advanced Bandit Algorithms Research