Distributed Clustering of Linear Bandits in Peer to Peer Networks
Nathan Korda, Balazs Szorenyi, Shuai Li

TL;DR
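This paper introduces two distributed confidence ball algorithms for linear bandit problems in peer-to-peer networks with limited communication. The first is proved to match the optimal asymptotic regret rate of a centralised algorithm, and the second additionally discovers clusters of peers that share a problem. Both are evaluated on several real-world datasets.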
Contribution
It presents two distributed confidence ball algorithms for linear bandits: one for peers that all solve the same problem, and one for peers grouped into clusters that each share a problem. Both come with proven regret guarantees.
Findings
Achieves the optimal asymptotic regret rate of any centralised algorithm with instant communication between peers, despite limited communication.
Successfully discovers clusters of peers solving the same problem.
Compares the proposed algorithms against state-of-the-art methods on several real-world datasets.
Abstract
We provide two distributed confidence ball algorithms for solving linear bandit problems in peer to peer networks with limited communication capabilities. For the first, we assume that all the peers are solving the same linear bandit problem, and prove that our algorithm achieves the optimal asymptotic regret rate of any centralised algorithm that can instantly communicate information between the peers. For the second, we assume that there are clusters of peers solving the same bandit problem within each cluster, and we prove that our algorithm discovers these clusters, while achieving the optimal asymptotic regret rate within each one. Through experiments on several real-world datasets, we demonstrate the performance of the proposed algorithms compared to the state-of-the-art.
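To make the idea of a confidence ball algorithm with peer-to-peer sharing more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithms: the class `ConfidenceBallAgent`, the function `gossip_average`, the fixed radius `beta`, and the plain pairwise averaging rule are all hypothetical choices made for this example. It does not reproduce the paper's communication protocol, its confidence radius, or its cluster discovery step.

```python
import numpy as np


class ConfidenceBallAgent:
    """One peer running a basic confidence-ball (LinUCB-style) linear bandit.

    Illustrative only: the names and the fixed radius are assumptions,
    not taken from the paper.
    """

    def __init__(self, dim, reg=1.0, beta=1.0):
        self.A = reg * np.eye(dim)  # regularised Gram matrix of played actions
        self.b = np.zeros(dim)      # sum of reward-weighted actions
        self.beta = beta            # confidence-ball radius, held fixed for simplicity

    def estimate(self):
        # Ridge-regression estimate of the unknown parameter vector.
        return np.linalg.solve(self.A, self.b)

    def choose(self, actions):
        # Pick the row of `actions` with the largest optimistic reward,
        # i.e. the estimated reward plus a width given by the confidence ball.
        theta = self.estimate()
        A_inv = np.linalg.inv(self.A)
        widths = np.sqrt(np.einsum("ij,jk,ik->i", actions, A_inv, actions))
        return int(np.argmax(actions @ theta + self.beta * widths))

    def update(self, x, reward):
        # Fold one observed (action, reward) pair into the statistics.
        self.A += np.outer(x, x)
        self.b += reward * x


def gossip_average(p, q):
    """Hypothetical pairwise exchange: both peers adopt the mean of their statistics."""
    A = (p.A + q.A) / 2
    b = (p.b + q.b) / 2
    p.A, q.A = A.copy(), A.copy()
    p.b, q.b = b.copy(), b.copy()
```

In a simulation, each peer would call `choose` and `update` on its own rounds and occasionally call `gossip_average` with a randomly chosen neighbour, so that information spreads through the network without a central server. Averaging only makes sense between peers that share a parameter vector, which is why the clustered setting described in the abstract needs a mechanism for deciding which peers to pool with.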
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Reinforcement Learning in Robotics
