Horde of Bandits using Gaussian Markov Random Fields

Sharan Vaswani; Mark Schmidt; Laks V.S. Lakshmanan

arXiv:1703.02626 · cs.LG · March 9, 2017 · 5 cites

TL;DR

This paper introduces a scalable approach to the gang of bandits model by leveraging Gaussian Markov random fields, enabling efficient learning in large, graph-structured bandit problems with theoretical guarantees and practical algorithms.

Contribution

The paper connects the GOB model to GMRFs to improve scalability, proposes a Thompson sampling algorithm with regret bounds, and offers a heuristic for learning the graph structure during the run.

Findings

1. Scalable GOB model using GMRFs demonstrated on large graphs.

2. Thompson sampling with GMRF sampling-by-perturbation outperforms clustering methods.

3. Effective graph learning heuristic shown to work well in experiments.

Abstract

The gang of bandits (GOB) model (Cesa-Bianchi et al., 2013) is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph. This model is useful in problems like recommender systems where the large number of users makes it vital to transfer information between users. Despite its effectiveness, the existing GOB model can only be applied to small problems due to its quadratic time dependence on the number of nodes. Existing solutions to combat the scalability issue require an often-unrealistic clustering assumption. By exploiting a connection to Gaussian Markov random fields (GMRFs), we show that the GOB model can be made to scale to much larger graphs without additional assumptions. In addition, we propose a Thompson sampling algorithm which uses the recent GMRF sampling-by-perturbation technique, allowing it to scale…
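The sampling-by-perturbation idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian prior whose precision is `reg * kron(L + I, I_dim)`, with `L` the Laplacian of the user graph, and Gaussian reward noise with standard deviation `noise_std`. The function names `graph_prior_factor` and `sample_by_perturbation`, the toy graph, and the dense `np.linalg.solve` call are all assumptions made for illustration; a scalable version would replace the dense solve with a sparse iterative solver.

```python
import numpy as np


def graph_prior_factor(adjacency, dim, reg=1.0):
    """Return C with C.T @ C = reg * kron(L + I, I_dim), L the graph Laplacian.

    Hypothetical helper: the form of the prior precision is an assumption.
    """
    n = adjacency.shape[0]
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # Lower-triangular R with R @ R.T = L + I (positive definite).
    chol = np.linalg.cholesky(laplacian + np.eye(n))
    return np.sqrt(reg) * np.kron(chol.T, np.eye(dim))


def sample_by_perturbation(prior_factor, features, rewards, noise_std, rng):
    """Draw one sample from N(A^{-1} b, A^{-1}) without factorizing A.

    A = C.T @ C + Phi.T @ Phi / noise_std**2
    b = Phi.T @ y / noise_std**2
    The right-hand side is b plus Gaussian noise with covariance A, so the
    solution of the linear system has covariance A^{-1} A A^{-1} = A^{-1}.
    """
    z_prior = rng.standard_normal(prior_factor.shape[0])
    z_data = rng.standard_normal(features.shape[0])
    precision = (prior_factor.T @ prior_factor
                 + features.T @ features / noise_std**2)
    rhs = (features.T @ rewards / noise_std**2
           + prior_factor.T @ z_prior
           + features.T @ z_data / noise_std)
    # Dense solve for clarity only; an iterative solver would go here.
    return np.linalg.solve(precision, rhs)


# Toy example: a path graph over 3 users, 2-dimensional contexts.
rng = np.random.default_rng(0)
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
dim = 2
C = graph_prior_factor(adjacency, dim)

# Two observed rounds: each context sits in the block of the user served.
features = np.zeros((2, 3 * dim))
features[0, 0:2] = [1.0, 0.0]   # user 0
features[1, 4:6] = [0.0, 1.0]   # user 2
rewards = np.array([1.0, 0.5])

theta = sample_by_perturbation(C, features, rewards, 0.5, rng)
print(theta.reshape(3, dim))    # one sampled preference vector per user
```

In a Thompson sampling round, the sampled block for the current user would be scored against each candidate context and the highest-scoring arm played. User 1 has no observations here, so its sampled vector is informed only through the graph prior linking it to users 0 and 2.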

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Recommender Systems and Techniques