On the optimal regret of collaborative personalized linear bandits
Bruce Huang, Ruida Zhou, Lin F. Yang, Suhas Diggavi

TL;DR
This paper characterizes the optimal regret bounds in collaborative personalized linear bandits, revealing how collaboration, heterogeneity, and the number of agents influence learning efficiency.
Contribution
It introduces a new information-theoretic lower bound and a two-stage collaborative algorithm that achieves this bound, advancing understanding of multi-agent bandit problems.
Findings
Optimal regret bounds depend jointly on the number of agents, the number of interaction rounds, and the degree of heterogeneity.
Collaboration can significantly reduce regret compared to non-collaborative approaches.
The proposed algorithm matches the theoretical lower bounds, demonstrating its optimality.
Abstract
Stochastic linear bandits are a fundamental model for sequential decision making, where an agent selects a vector-valued action and receives a noisy reward with expected value given by an unknown linear function. Although well studied in the single-agent setting, many real-world scenarios involve multiple agents solving heterogeneous bandit problems, each with a different unknown parameter. Applying single-agent algorithms independently ignores cross-agent similarity and the learning opportunities it offers. This paper investigates the optimal regret achievable in collaborative personalized linear bandits. We provide an information-theoretic lower bound that characterizes how the number of agents, the interaction rounds, and the degree of heterogeneity jointly affect regret. We then propose a new two-stage collaborative algorithm that achieves the optimal regret. Our analysis models heterogeneity via…
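The setting described in the abstract can be illustrated with a minimal sketch. It shows only the reward model (a noisy reward whose mean is linear in the chosen action) and the regret of each agent acting on its own parameter. It is not the paper's two-stage algorithm, and it does not reproduce the paper's heterogeneity model. The function names, the two-dimensional action set, and the per-agent parameter values are all hypothetical, chosen only for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def pull(theta, action, noise_std, rng):
    """Noisy reward whose expected value is the linear function <theta, action>."""
    return dot(theta, action) + rng.gauss(0.0, noise_std)


def pseudo_regret(theta, actions, chosen):
    """Cumulative gap between the best available action and the chosen actions."""
    best = max(dot(theta, a) for a in actions)
    return sum(best - dot(theta, a) for a in chosen)


# Hypothetical action set and per-agent parameters (assumed values, not from the paper).
actions = [[1.0, 0.0], [0.0, 1.0]]
agent_thetas = {
    "agent_0": [1.0, 0.2],
    "agent_1": [0.9, 0.3],
}

rng = random.Random(0)
for name, theta in agent_thetas.items():
    # Each agent picks actions uniformly at random, a deliberately naive baseline.
    chosen = [rng.choice(actions) for _ in range(100)]
    rewards = [pull(theta, a, noise_std=0.1, rng=rng) for a in chosen]
    print(name, "pseudo-regret:", round(pseudo_regret(theta, actions, chosen), 3))
```

The two agents here have similar but not identical parameters, which is the kind of cross-agent similarity that independent single-agent algorithms would leave unused.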
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Mobile Crowdsensing and Crowdsourcing
