On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure
Alessio Russo, Alexandre Proutiere

TL;DR
This paper analyzes the sample complexity of learning optimal representations and predictors in multi-task bandit problems, proposing an algorithm that nearly matches lower bounds and significantly improves over classical methods.
Contribution
It introduces a new framework for multi-task bandits with shared representations, deriving lower bounds and presenting an efficient algorithm with near-optimal sample complexity.
Findings
The derived lower bounds apply to any PAC algorithm in this setting.
The proposed OSRL-SC algorithm approaches the lower bounds in sample complexity.
OSRL-SC scales better than classical methods, especially with many tasks and predictors.
Abstract
We investigate the sample complexity of learning the optimal arm for multi-task bandit problems. Arms consist of two components: one that is shared across tasks (which we call the representation) and one that is task-specific (which we call the predictor). The objective is to learn the optimal (representation, predictor)-pair for each task, under the assumption that the optimal representation is common to all tasks. Within this framework, efficient learning algorithms should transfer knowledge across tasks. We consider the best-arm identification problem for a fixed confidence, where, in each round, the learner actively selects both a task and an arm, and observes the corresponding reward. We derive instance-specific sample complexity lower bounds satisfied by any $(\delta_G, \delta_H)$-PAC algorithm (such an algorithm identifies the best representation with probability at least $1-\delta_G$, and…
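The setting described in the abstract can be illustrated with a small toy environment. This is only a sketch under stated assumptions, not the paper's OSRL-SC algorithm: the class `MultiTaskBandit`, the helper `best_pairs`, the Bernoulli reward model, and the table `MEANS` are all hypothetical and chosen for illustration. It shows arms as (representation, predictor) pairs, a learner that picks both a task and an arm in each round, and the structural assumption that the optimal representation is the same in every task.

```python
import random


class MultiTaskBandit:
    """Toy multi-task bandit (hypothetical): arms are (representation, predictor) pairs."""

    def __init__(self, means, seed=0):
        # means[x][g][h] = expected reward in task x for representation g, predictor h
        self.means = means
        self.rng = random.Random(seed)

    def pull(self, x, g, h):
        # Bernoulli reward; the reward distribution is an assumption of this sketch
        return 1.0 if self.rng.random() < self.means[x][g][h] else 0.0


def best_pairs(means):
    """Return (shared best representation, best predictor per task).

    Raises ValueError if the tasks do not share one optimal representation.
    """
    per_task = []
    for task_means in means:
        arms = [(g, h) for g in range(len(task_means))
                for h in range(len(task_means[g]))]
        per_task.append(max(arms, key=lambda gh: task_means[gh[0]][gh[1]]))
    representations = {g for g, _ in per_task}
    if len(representations) != 1:
        raise ValueError("optimal representation is not shared across tasks")
    return representations.pop(), [h for _, h in per_task]


# Two tasks, two representations, two predictors (illustrative numbers only).
MEANS = [
    [[0.2, 0.3], [0.9, 0.5]],  # task 0: best arm is (g=1, h=0)
    [[0.4, 0.1], [0.3, 0.8]],  # task 1: best arm is (g=1, h=1)
]

env = MultiTaskBandit(MEANS, seed=0)
reward = env.pull(x=0, g=1, h=0)      # one round: choose a task and an arm
shared_g, predictors = best_pairs(MEANS)
```

In this instance representation 1 is optimal in both tasks while the best predictor differs per task, which is the global/local structure that a learner can exploit by pooling evidence about the representation across tasks.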
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
