Meta Representation Learning with Contextual Linear Bandits
Leonardo Cella, Karim Lounici, Massimiliano Pontil

TL;DR
This paper introduces a meta-learning approach for stochastic linear bandit tasks that leverages shared low-dimensional representations to efficiently learn new tasks, providing regret bounds and relaxing previous assumptions.
Contribution
It proposes a greedy policy for downstream tasks that estimates the shared representation, achieving near-optimal regret bounds without knowing the representation dimension.
Findings
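Regret bound of order r√N(1 ∨ √(d/T)), up to logarithmic factors, where N is the horizon of the downstream task, T the number of training tasks, d the ambient dimension, and r the representation dimension.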
Strategy does not require prior knowledge of the representation dimension r.
Achieves the same rate as optimal minimax algorithms when T > d.
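The interplay between these findings can be checked with a little arithmetic. The sketch below reads the bound as r·√N·max(1, √(d/T)), the reading consistent with the T > d statement, and drops logarithmic factors and constants. The function names `regret_rate` and `oracle_rate` and the numeric values are illustrative assumptions, not taken from the paper.

```python
import math

def regret_rate(r, N, d, T):
    """Order of the bound r*sqrt(N)*max(1, sqrt(d/T)); logs and constants dropped."""
    return r * math.sqrt(N) * max(1.0, math.sqrt(d / T))

def oracle_rate(r, N):
    """Order r*sqrt(N), the rate when the true representation is known."""
    return r * math.sqrt(N)

# Hypothetical sizes: r = 5, horizon N = 10_000, ambient dimension d = 100.
many_tasks = regret_rate(r=5, N=10_000, d=100, T=400)  # T > d
few_tasks = regret_rate(r=5, N=10_000, d=100, T=25)    # T < d

print(many_tasks, few_tasks, oracle_rate(5, 10_000))
```

With T > d the max is attained at 1 and the rate coincides with the known-representation rate. With T < d the rate is inflated by the factor √(d/T).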
Abstract
Meta-learning seeks to build algorithms that rapidly learn how to solve new learning problems based on previous experience. In this paper we investigate meta-learning in the setting of stochastic linear bandit tasks. We assume that the tasks share a low dimensional representation, which has been partially acquired from previous learning tasks. We aim to leverage this information in order to learn a new downstream bandit task, which shares the same representation. Our principal contribution is to show that if the learned representation estimates well the unknown one, then the downstream task can be efficiently learned by a greedy policy that we propose in this work. We derive an upper bound on the regret of this policy, which is, up to logarithmic factors, of order r√N(1 ∨ √(d/T)), where N is the horizon of the downstream task, T is the number of training tasks, the…
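To make the idea of a greedy policy on top of a learned representation concrete, here is a minimal sketch. It is not the policy proposed in the paper: it assumes an estimated representation `B_hat` of known width r is already given (the paper's strategy does not need to know r), and it assumes Gaussian contexts, a ridge estimator, and K candidate arms per round. The name `greedy_with_representation` and every parameter are hypothetical.

```python
import numpy as np

def greedy_with_representation(B_hat, w_true, N, K=10, noise=0.1, lam=1.0, seed=0):
    """Greedy linear bandit run in the r-dim space spanned by B_hat (d x r).

    Illustrative assumptions: random Gaussian contexts, ridge regression,
    and a purely greedy arm choice with no exploration bonus.
    """
    rng = np.random.default_rng(seed)
    d, r = B_hat.shape
    A = lam * np.eye(r)   # regularised Gram matrix of projected features
    b = np.zeros(r)       # reward-weighted sum of projected features
    cum_regret = np.zeros(N)
    total = 0.0
    for n in range(N):
        X = rng.normal(size=(K, d)) / np.sqrt(d)  # K candidate contexts
        Z = X @ B_hat                             # project to r dimensions
        alpha_hat = np.linalg.solve(A, b)         # ridge estimate in r dims
        k = int(np.argmax(Z @ alpha_hat))         # greedy choice
        means = X @ w_true                        # true expected rewards
        reward = means[k] + noise * rng.normal()
        A += np.outer(Z[k], Z[k])
        b += reward * Z[k]
        total += means.max() - means[k]           # instantaneous regret >= 0
        cum_regret[n] = total
    return cum_regret

# Usage with a hypothetical task whose parameter lies in the span of B.
rng = np.random.default_rng(1)
d, r = 20, 3
B, _ = np.linalg.qr(rng.normal(size=(d, r)))  # orthonormal d x r basis
w = B @ rng.normal(size=r)
regret = greedy_with_representation(B, w, N=200)
```

The point of the sketch is only that estimation happens in r dimensions rather than d once a representation is available. How the representation is learned from the T training tasks, and how its estimation error enters the regret, is what the paper analyses.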
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
