Meta Representation Learning with Contextual Linear Bandits

Leonardo Cella; Karim Lounici; Massimiliano Pontil

arXiv:2205.15100 · cs.LG · May 31, 2022

Open Access

TL;DR

This paper introduces a meta-learning approach for stochastic linear bandit tasks that leverages shared low-dimensional representations to efficiently learn new tasks, providing regret bounds and relaxing previous assumptions.

Contribution

It proposes a greedy policy for downstream tasks that estimates the shared representation, achieving near-optimal regret bounds without knowing the representation dimension.

Findings

01

Regret bound of order r√N(1 ∨ √(d/T)), up to logarithmic factors.

02

Strategy does not require prior knowledge of the representation dimension r.

03

When T > d, the bound matches the rate of optimal minimax bandit algorithms that use the true underlying representation.
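
To make the two regimes of the bound concrete, here is a minimal sketch that evaluates only its leading-order rate, assuming the form r√N · max(1, √(d/T)) and dropping all constants and logarithmic factors. The function name `regret_rate` and the numeric values are hypothetical, chosen for illustration; they do not come from the paper.

```python
import math


def regret_rate(r: int, N: int, d: int, T: int) -> float:
    """Leading-order rate r * sqrt(N) * max(1, sqrt(d / T)).

    Constants and logarithmic factors are omitted, so the value is only
    meaningful for comparing regimes, not as an actual regret figure.
    """
    return r * math.sqrt(N) * max(1.0, math.sqrt(d / T))


# Hypothetical problem sizes (illustration only).
r, N, d = 5, 10_000, 100

# Fewer training tasks than ambient dimensions: the sqrt(d / T) factor dominates.
few_tasks = regret_rate(r, N, d, T=25)

# More training tasks than ambient dimensions: the factor saturates at 1,
# leaving the r * sqrt(N) rate.
many_tasks = regret_rate(r, N, d, T=400)

print(few_tasks, many_tasks)
```

With these values the rate is twice as large when T = 25 as when T = 400, and increasing T beyond d gives no further improvement in this leading-order term.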

Abstract

Meta-learning seeks to build algorithms that rapidly learn how to solve new learning problems based on previous experience. In this paper we investigate meta-learning in the setting of stochastic linear bandit tasks. We assume that the tasks share a low dimensional representation, which has been partially acquired from previous learning tasks. We aim to leverage this information in order to learn a new downstream bandit task, which shares the same representation. Our principal contribution is to show that if the learned representation estimates well the unknown one, then the downstream task can be efficiently learned by a greedy policy that we propose in this work. We derive an upper bound on the regret of this policy, which is, up to logarithmic factors, of order $r \sqrt{N} (1 \lor \sqrt{d / T})$, where $N$ is the horizon of the downstream task, $T$ is the number of training tasks, $d$ the…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning