Optimal Algorithms for Latent Bandits with Cluster Structure
Soumyabrata Pal, Arun Sai Suggala, Karthikeyan Shanmugam, Prateek Jain

TL;DR
This paper introduces LATTICE, an algorithm for latent bandits with cluster structure that exploits the latent clusters to achieve minimax optimal regret (up to logarithmic factors) with only a small number of calls to a matrix completion oracle. The setting is motivated by recommendation systems.
Contribution
The paper presents the first algorithm with minimax optimal regret for latent clustered bandits, combining matrix completion with clustering for efficiency.
Findings
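Achieves regret of $\widetilde{O}(\sqrt{(\mathsf{M}+\mathsf{N})\mathsf{T}})$ when the number of clusters is $\widetilde{O}(1)$, where $\mathsf{M}$ and $\mathsf{N}$ are the number of arms and users and $\mathsf{T}$ is the number of rounds.
Requires only $O(\log \mathsf{T})$ calls to an offline matrix completion oracle across all $\mathsf{T}$ rounds.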
First algorithm to guarantee such strong regret bounds for latent cluster bandits.
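To give a feel for the size of the improvement, the sketch below compares the two regret orders, $\sqrt{\mathsf{MNT}}$ for users who learn independently and $\sqrt{(\mathsf{M}+\mathsf{N})\mathsf{T}}$ for the clustered setting, with all constants and logarithmic factors dropped. The function names and the values of M, N, and T are illustrative assumptions, not figures from the paper.

```python
import math

def independent_bound(num_arms, num_users, horizon):
    """Order sqrt(M*N*T): users explore every arm on their own (constants dropped)."""
    return math.sqrt(num_arms * num_users * horizon)

def clustered_bound(num_arms, num_users, horizon):
    """Order sqrt((M+N)*T): cluster structure is exploited (constants and logs dropped)."""
    return math.sqrt((num_arms + num_users) * horizon)

# Illustrative problem size (assumed, not from the paper).
M, N, T = 100, 100, 1_000_000

# The ratio simplifies to sqrt(M*N / (M+N)) and does not depend on T.
ratio = independent_bound(M, N, T) / clustered_bound(M, N, T)
print(f"independent / clustered = {ratio:.2f}")
```

Because the ratio is $\sqrt{\mathsf{MN}/(\mathsf{M}+\mathsf{N})}$, the gap widens as both the number of arms and the number of users grow.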
Abstract
We consider the problem of latent bandits with cluster structure where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. At each round, a user, selected uniformly at random, pulls an arm and observes a corresponding noisy reward. The goal of the users is to maximize their cumulative rewards. This problem is central to practical recommendation systems and has received wide attention of late \cite{gentile2014online, maillard2014latent}. Now, if each user acts independently, then they would have to explore each arm independently and a regret of $\Omega(\sqrt{\mathsf{MNT}})$ is unavoidable, where $\mathsf{M}, \mathsf{N}$ are the number of arms and users, respectively. Instead, we propose LATTICE (Latent bAndiTs via maTrIx…
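The problem setting described in the abstract can be sketched as a small simulator. This is a minimal illustration only: the Gaussian noise, the uniform draw of cluster means, the function names, and the uniformly random arm choice are all assumptions made for the example. The arm choice is a placeholder policy, not LATTICE.

```python
import numpy as np

def make_environment(num_users, num_arms, num_clusters, seed=0):
    """Draw one mean reward vector per cluster and assign each user to a latent cluster."""
    rng = np.random.default_rng(seed)
    cluster_means = rng.uniform(0.0, 1.0, size=(num_clusters, num_arms))
    cluster_of_user = rng.integers(0, num_clusters, size=num_users)
    return cluster_means, cluster_of_user

def run_random_policy(cluster_means, cluster_of_user, horizon, noise_std=0.1, seed=1):
    """Simulate `horizon` rounds; returns (total observed reward, cumulative regret)."""
    rng = np.random.default_rng(seed)
    num_users = cluster_of_user.shape[0]
    num_arms = cluster_means.shape[1]
    total_reward, regret = 0.0, 0.0
    for _ in range(horizon):
        user = rng.integers(num_users)           # user selected uniformly at random
        arm = rng.integers(num_arms)             # placeholder policy: random arm
        means = cluster_means[cluster_of_user[user]]  # shared within the user's cluster
        total_reward += means[arm] + noise_std * rng.normal()  # noisy reward (assumed Gaussian)
        regret += means.max() - means[arm]       # gap to that user's best arm
    return total_reward, regret

cluster_means, cluster_of_user = make_environment(num_users=50, num_arms=20, num_clusters=3)
total_reward, regret = run_random_policy(cluster_means, cluster_of_user, horizon=5000)
print(f"cumulative regret of the random policy: {regret:.1f}")
```

A learning algorithm would replace the random arm choice and would see only the user index and the noisy reward, never `cluster_of_user` or `cluster_means`.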
Taxonomy
Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Age of Information Optimization
