On the optimal regret of collaborative personalized linear bandits

Bruce Huang; Ruida Zhou; Lin F. Yang; Suhas Diggavi

arXiv:2506.15943 · cs.LG · June 23, 2025

Open Access

TL;DR

This paper characterizes the optimal regret bounds in collaborative personalized linear bandits, revealing how collaboration, heterogeneity, and the number of agents influence learning efficiency.

Contribution

The paper introduces a new information-theoretic lower bound on regret and a two-stage collaborative algorithm that achieves it.

Findings

01

Optimal regret bounds depend on the number of agents, the number of interaction rounds, and the degree of heterogeneity.

02

Collaboration can significantly reduce regret compared to non-collaborative approaches.

03

The proposed algorithm matches the theoretical lower bounds, demonstrating its optimality.

Abstract

Stochastic linear bandits are a fundamental model for sequential decision making, where an agent selects a vector-valued action and receives a noisy reward with expected value given by an unknown linear function. Although well studied in the single-agent setting, many real-world scenarios involve multiple agents solving heterogeneous bandit problems, each with a different unknown parameter. Applying single-agent algorithms independently ignores cross-agent similarity and learning opportunities. This paper investigates the optimal regret achievable in collaborative personalized linear bandits. We provide an information-theoretic lower bound that characterizes how the number of agents, the interaction rounds, and the degree of heterogeneity jointly affect regret. We then propose a new two-stage collaborative algorithm that achieves the optimal regret. Our analysis models heterogeneity via…
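
As a rough illustration of the setting the abstract describes, the sketch below simulates several agents that each face a linear bandit with their own unknown parameter, and measures the pseudo-regret of a naive random policy. It is not the paper's algorithm. All names (`make_agent_parameters`, `pull`, `pseudo_regret`) are hypothetical, and generating each agent's parameter as a shared center plus a small bounded perturbation is an assumption made only for this example, since the abstract is cut off before it states how heterogeneity is modeled.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def make_agent_parameters(num_agents, dim, spread, rng):
    """Draw a shared center and one perturbed parameter per agent.

    Assumption of this sketch only: heterogeneity is a bounded
    per-coordinate perturbation around a common center. The paper's
    actual heterogeneity model is not reproduced here.
    """
    center = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    params = [
        [c + rng.uniform(-spread, spread) for c in center]
        for _ in range(num_agents)
    ]
    return center, params


def pull(theta, action, noise_std, rng):
    """Noisy reward whose expected value is the linear function <action, theta>."""
    return dot(action, theta) + rng.gauss(0.0, noise_std)


def pseudo_regret(theta, actions_taken, action_set):
    """Sum over rounds of (best expected reward - expected reward of the chosen action)."""
    best = max(dot(a, theta) for a in action_set)
    return sum(best - dot(a, theta) for a in actions_taken)


rng = random.Random(0)
action_set = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
center, params = make_agent_parameters(num_agents=3, dim=2, spread=0.1, rng=rng)

# Baseline with no learning and no collaboration:
# every agent plays uniformly random actions for 50 rounds.
regrets = []
for theta in params:
    chosen = [rng.choice(action_set) for _ in range(50)]
    regrets.append(pseudo_regret(theta, chosen, action_set))
```

A learning algorithm would use the rewards returned by `pull` to choose better actions over time. In the collaborative setting, agents could additionally share observations, which is where the similarity between the per-agent parameters becomes useful.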

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Mobile Crowdsensing and Crowdsourcing