Leveraging Offline Data in Linear Latent Contextual Bandits

Chinmaya Kausik; Kevin Tan; Ambuj Tewari

arXiv:2405.17324·cs.LG·September 3, 2025

TL;DR

This paper introduces algorithms for linear latent contextual bandits that use offline data to accelerate online decision-making, with theoretical regret guarantees and empirical validation on synthetic and real datasets.

Contribution

It proposes the first end-to-end algorithms for linear latent bandits that handle uncountably many latent states, including an offline subspace learning method and two online algorithms with optimal regret bounds.

Findings

1. Offline subspace learning with provable guarantees
2. Online algorithms with minimax optimal regret bounds
3. Validated effectiveness on synthetic and real-world data

Abstract

Leveraging offline data is an attractive way to accelerate online sequential decision-making. However, it is crucial to account for latent states in users or environments in the offline data, and latent bandits form a compelling model for doing so. In this light, we design end-to-end latent bandit algorithms capable of handling uncountably many latent states. We focus on a linear latent contextual bandit: a linear bandit where each user has its own high-dimensional reward parameter in $\mathbb{R}^{d_{A}}$, but reward parameters across users lie in a low-rank latent subspace of dimension $d_{K} \ll d_{A}$. First, we provide an offline algorithm to learn this subspace with provable guarantees. We then present two online algorithms that utilize the output of this offline algorithm to accelerate online learning. The first enjoys $\tilde{O}(\min(d_{A}\sqrt{T}, d_{K}\sqrt{T}(1 + \sqrt{d_{A} T / (d_{K} N)})))$ …
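To make the low-rank structure in the abstract concrete, here is a minimal Python sketch of one way a $d_{K}$-dimensional subspace could be recovered from offline data. It is not the paper's offline algorithm: it assumes Gaussian action features, a fixed number of offline samples per user, and a plain SVD of per-user least-squares estimates. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_offline_estimates(U, n_users, n_samples, noise_std, rng):
    """Per-user least-squares estimates of reward parameters theta = U @ z.

    Assumption: every offline user contributes n_samples noisy linear rewards.
    """
    d_A, d_K = U.shape
    estimates = np.empty((n_users, d_A))
    for i in range(n_users):
        z = rng.normal(size=d_K)                 # latent state of user i
        theta = U @ z                            # reward parameter inside the subspace
        X = rng.normal(size=(n_samples, d_A))    # action features (assumed Gaussian)
        y = X @ theta + noise_std * rng.normal(size=n_samples)
        estimates[i] = np.linalg.lstsq(X, y, rcond=None)[0]
    return estimates


def estimate_subspace(estimates, d_K):
    """Top-d_K right singular vectors of the stacked estimates, as columns."""
    _, _, Vt = np.linalg.svd(estimates, full_matrices=False)
    return Vt[:d_K].T


def subspace_error(U_true, U_hat):
    """Spectral norm of the difference between the two projection matrices."""
    return np.linalg.norm(U_true @ U_true.T - U_hat @ U_hat.T, ord=2)


rng = np.random.default_rng(0)
d_A, d_K = 20, 3                                 # ambient and latent dimensions
U_true, _ = np.linalg.qr(rng.normal(size=(d_A, d_K)))  # orthonormal basis of the subspace

estimates = simulate_offline_estimates(
    U_true, n_users=200, n_samples=60, noise_std=0.1, rng=rng
)
U_hat = estimate_subspace(estimates, d_K)
err = subspace_error(U_true, U_hat)
print(f"projection error: {err:.4f}")

# An online learner could then work with d_K-dimensional features X @ U_hat
# in place of the d_A-dimensional X.
X_new = rng.normal(size=(5, d_A))
X_reduced = X_new @ U_hat
```

With an accurate subspace estimate, the online problem effectively has dimension $d_{K}$ rather than $d_{A}$, which is the intuition behind regret bounds that improve as the offline dataset grows.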

Taxonomy

Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research · Machine Learning and Data Classification
