Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits

Jiabin Lin; Shana Moothedath; Namrata Vaswani

arXiv:2410.02068 · cs.LG · January 8, 2025

TL;DR

This paper introduces a multi-task learning algorithm for stochastic contextual bandits that leverages shared low-rank representations to improve learning efficiency, supported by theoretical regret bounds and experimental comparisons.

Contribution

The paper proposes a novel algorithm combining alternating projected gradient descent and a minimization estimator to recover low-rank feature matrices in multi-task contextual bandits.

Findings

01. The algorithm achieves regret bounds that outperform traditional methods.

02. Experimental results show improved sample efficiency and performance over benchmarks.

03. The approach effectively leverages shared representations across multiple bandit tasks.

Abstract

We study how representation learning can improve the learning efficiency of contextual bandit problems. We consider the setting where we play T contextual linear bandits of dimension d simultaneously, and these T bandit tasks share a common linear representation of dimension r, with r much smaller than d. We present a new algorithm based on alternating projected gradient descent (GD) and a minimization estimator to recover the low-rank feature matrix. Using the proposed estimator, we present a multi-task learning algorithm for linear contextual bandits and prove a regret bound for it. We present experiments comparing the performance of our algorithm against benchmark algorithms.
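To make the estimation idea concrete, here is a minimal sketch of a generic alternating projected GD and minimization loop for recovering a shared low-rank matrix from per-task linear measurements. It is an illustration under stated assumptions, not the paper's bandit algorithm: the features are random Gaussian vectors rather than contexts chosen by a bandit policy, the rewards are noise-free, the initialization is a simplified spectral step, and the step-size rule is a heuristic chosen for this example. All function and variable names (`altgdmin`, `make_problem`, `subspace_distance`) are hypothetical.

```python
import numpy as np


def make_problem(d=30, r=2, T=60, n=20, seed=0):
    """Synthetic data (assumption): T tasks, n samples each, Theta* = B* W*."""
    rng = np.random.default_rng(seed)
    B_true, _ = np.linalg.qr(rng.standard_normal((d, r)))  # shared d x r basis
    W_true = rng.standard_normal((r, T))                   # per-task r-dim weights
    X = rng.standard_normal((T, n, d))                     # per-task feature rows
    y = np.einsum("tnd,dt->tn", X, B_true @ W_true)        # noise-free rewards
    return X, y, B_true, W_true


def min_step(X, y, B):
    """Minimization step: for fixed B, solve an r-dim least squares per task."""
    T = X.shape[0]
    W = np.empty((B.shape[1], T))
    for t in range(T):
        W[:, t] = np.linalg.lstsq(X[t] @ B, y[t], rcond=None)[0]
    return W


def altgdmin(X, y, r, n_iters=300, step=0.5):
    """Alternate per-task minimization over W with a projected GD step on B."""
    T, n, d = X.shape
    # Simplified spectral initialization: top-r left singular vectors of the
    # d x T matrix whose t-th column is X_t^T y_t / n.
    theta0 = np.einsum("tnd,tn->dt", X, y) / n
    U, _, _ = np.linalg.svd(theta0, full_matrices=False)
    B = U[:, :r]
    for _ in range(n_iters):
        W = min_step(X, y, B)
        # Gradient of 0.5 * sum_t ||y_t - X_t B w_t||^2 with respect to B.
        resid = np.einsum("tnd,dt->tn", X, B @ W) - y
        grad = np.einsum("tnd,tn,rt->dr", X, resid, W)
        # Heuristic step size (assumption): scale by n * sigma_max(W)^2.
        eta = step / (n * np.linalg.norm(W, 2) ** 2)
        # Projection step: QR re-orthonormalizes the columns of B.
        B, _ = np.linalg.qr(B - eta * grad)
    return B, min_step(X, y, B)


def subspace_distance(B_hat, B_true):
    """Spectral norm of the part of B_true outside the span of B_hat."""
    return np.linalg.norm(B_true - B_hat @ (B_hat.T @ B_true), 2)


if __name__ == "__main__":
    X, y, B_true, W_true = make_problem()
    B_hat, W_hat = altgdmin(X, y, r=2)
    print("subspace distance:", subspace_distance(B_hat, B_true))
```

In this synthetic setup each task has fewer samples than the ambient dimension (n = 20 versus d = 30), so no single task could be solved by ordinary least squares on its own; pooling the tasks through the shared r-dimensional basis is what makes recovery possible, which is the sample-efficiency argument the abstract makes.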

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Data Stream Mining Techniques