Beyond Task Diversity: Provable Representation Transfer for Sequential Multi-Task Linear Bandits
Thang Duong, Zhi Wang, and Chicheng Zhang

TL;DR
This paper introduces a new algorithm for sequential multi-task linear bandits that learns low-rank representations without requiring task diversity, improving regret bounds and demonstrating empirical advantages over existing methods.
Contribution
The first work to address sequential multi-task linear bandits without assuming task diversity, providing an efficient algorithm with provable regret guarantees.
Findings
Achieves a regret of $\tilde{O}\big(Nm\sqrt{\tau} + N^{2/3} \tau^{2/3} d m^{1/3} + Nd^2 + \tau m d\big)$
Outperforms baseline algorithms on synthetic data
Removes the unrealistic task diversity assumption in lifelong learning for linear bandits
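To get a feel for how the terms of the stated regret bound trade off, the following minimal sketch evaluates each term numerically. It ignores constants and logarithmic factors, so the numbers are only indicative. The function names are hypothetical, and the comparison value N d sqrt(tau) is an assumption here: it is the standard order of regret when each task is solved independently with a generic linear bandit algorithm, not a figure taken from this page.

```python
import math


def regret_terms(N, tau, d, m):
    """Terms of the stated bound, without constants or log factors.

    N: number of tasks, tau: rounds per task,
    d: ambient dimension, m: dimension of the shared subspace.
    """
    return {
        "N*m*sqrt(tau)": N * m * math.sqrt(tau),
        "N^(2/3)*tau^(2/3)*d*m^(1/3)": N ** (2 / 3) * tau ** (2 / 3) * d * m ** (1 / 3),
        "N*d^2": N * d ** 2,
        "tau*m*d": tau * m * d,
    }


def independent_baseline(N, tau, d):
    """Assumed comparison: N independent tasks at order d*sqrt(tau) each."""
    return N * d * math.sqrt(tau)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    N, tau, d, m = 1000, 10_000, 50, 3
    terms = regret_terms(N, tau, d, m)
    for name, value in terms.items():
        print(f"{name:32s} {value:.3e}")
    print(f"{'sum of terms':32s} {sum(terms.values()):.3e}")
    print(f"{'independent baseline':32s} {independent_baseline(N, tau, d):.3e}")
```

Varying N, tau, d, and m shows which term dominates in a given regime; because constants are dropped, only the relative growth of the terms is meaningful.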
Abstract
We study lifelong learning in linear bandits, where a learner interacts with a sequence of linear bandit tasks whose parameters lie in an $m$-dimensional subspace of $\mathbb{R}^d$, thereby sharing a low-rank representation. Current literature typically assumes that the tasks are diverse, i.e., their parameters uniformly span the $m$-dimensional subspace. This assumption allows the low-rank representation to be learned before all tasks are revealed, which can be unrealistic in real-world applications. In this work, we present the first nontrivial result for sequential multi-task linear bandits without the task diversity assumption. We develop an algorithm that efficiently learns and transfers low-rank representations. When facing $N$ tasks, each played over $\tau$ rounds, our algorithm achieves a regret guarantee of $\tilde{O}\big (Nm \sqrt{\tau} + N^{\frac{2}{3}} \tau^{\frac{2}{3}} d…
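The setting described in the abstract can be illustrated with a small synthetic generator. This is a sketch under stated assumptions, not the paper's experimental setup: the function name `make_tasks`, the unit-norm scaling, and the particular way diversity is violated (all tasks but the last share one direction, and the last task reveals a second) are hypothetical choices made for illustration.

```python
import numpy as np


def make_tasks(N, d, m, diverse, seed=0):
    """Draw N task parameters in a shared m-dimensional subspace of R^d.

    Returns (B, thetas): B is a d x m orthonormal basis of the subspace,
    thetas is an N x d array of unit-norm task parameters.
    """
    rng = np.random.default_rng(seed)
    # Orthonormal basis from the reduced QR factorization of a Gaussian matrix.
    B, _ = np.linalg.qr(rng.standard_normal((d, m)))
    if diverse:
        # Coefficients spread over all m directions of the subspace.
        W = rng.standard_normal((N, m))
    else:
        # Hypothetical non-diverse sequence: every task uses only the first
        # direction, and the final task adds a component along the last one.
        W = np.zeros((N, m))
        W[:, 0] = rng.standard_normal(N)
        W[-1, -1] = 1.0
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # unit-norm coefficients
    return B, W @ B.T


if __name__ == "__main__":
    B, thetas = make_tasks(N=10, d=6, m=3, diverse=False)
    # Rank of the parameters seen before the last task, then of all tasks.
    print(np.linalg.matrix_rank(thetas[:-1]), np.linalg.matrix_rank(thetas))
```

In the non-diverse case the early tasks do not span the full subspace, so a learner cannot recover the whole representation from them alone; this is the situation the task diversity assumption rules out.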
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Auction Theory and Applications
Methods: Sparse Evolutionary Training
