Horde of Bandits using Gaussian Markov Random Fields
Sharan Vaswani, Mark Schmidt, Laks V.S. Lakshmanan

TL;DR
This paper introduces a scalable approach to the gang of bandits model by leveraging Gaussian Markov random fields, enabling efficient learning in large, graph-structured bandit problems with theoretical guarantees and practical algorithms.
Contribution
The paper connects the GOB model to GMRFs to improve scalability, proposes a Thompson sampling algorithm with regret bounds, and offers a heuristic for learning the graph structure on the fly.
Findings
The GMRF-based GOB model scales to large graphs.
Thompson sampling with GMRF sampling-by-perturbation outperforms clustering-based methods.
The graph-learning heuristic works well in experiments.
Abstract
The gang of bandits (GOB) model \cite{cesa2013gang} is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph. This model is useful in problems like recommender systems where the large number of users makes it vital to transfer information between users. Despite its effectiveness, the existing GOB model can only be applied to small problems due to its quadratic time-dependence on the number of nodes. Existing solutions to combat the scalability issue require an often-unrealistic clustering assumption. By exploiting a connection to Gaussian Markov random fields (GMRFs), we show that the GOB model can be made to scale to much larger graphs without additional assumptions. In addition, we propose a Thompson sampling algorithm which uses the recent GMRF sampling-by-perturbation technique, allowing it to scale…
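The sampling-by-perturbation idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dense NumPy linear algebra, and the prior precision lam * (L + I) ⊗ I_d are assumptions chosen for illustration. The point it shows is that a draw from a Gaussian posterior with precision A = Q0 + XᵀX/σ² can be obtained by perturbing the right-hand side of a linear system and solving it once, without ever forming the covariance A⁻¹. A scalable version would presumably replace the dense Cholesky factor and solve with sparse or iterative routines.

```python
import numpy as np

def gob_prior_precision(laplacian, d, lam=1.0):
    """Prior precision over the stacked user vectors (assumed form).

    Assumption: precision = lam * (L + I) kron I_d, where L is the graph
    Laplacian over users and d is the context dimension.
    """
    n = laplacian.shape[0]
    return lam * np.kron(laplacian + np.eye(n), np.eye(d))

def sample_by_perturbation(Q0, X, r, sigma, rng):
    """Draw one sample from N(A^{-1} b, A^{-1}) without inverting A.

    A = Q0 + X^T X / sigma^2 and b = X^T r / sigma^2.
    X holds one row per observation (context embedded in the block of the
    user who was served); r holds the observed rewards.
    """
    A = Q0 + X.T @ X / sigma**2
    b = X.T @ r / sigma**2
    # Prior perturbation with covariance Q0 (dense Cholesky, small problems only).
    C = np.linalg.cholesky(Q0)
    z0 = C @ rng.standard_normal(Q0.shape[0])
    # Data perturbation with covariance X^T X / sigma^2.
    z1 = X.T @ rng.standard_normal(X.shape[0]) / sigma
    # The perturbed right-hand side has covariance A, so the solution has
    # mean A^{-1} b and covariance A^{-1} A A^{-1} = A^{-1}.
    return np.linalg.solve(A, b + z0 + z1)
```

In a Thompson sampling loop, the sampled vector would be split into per-user blocks, and the arm with the highest predicted reward under the current user's block would be played.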
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Recommender Systems and Techniques
