On Context-Dependent Clustering of Bandits
Claudio Gentile, Shuai Li, Purushottam Kar, Alexandros Karatzoglou, Evans Etrue, Giovanni Zappella

TL;DR
This paper introduces CAB, a novel context-dependent clustering algorithm for bandit-based recommendation systems, which improves prediction accuracy by effectively sharing feedback among user clusters.
Contribution
CAB integrates collaborative effects into bandit inference and learning, with proven regret bounds and superior performance on real-world datasets.
Findings
CAB outperforms existing methods in prediction accuracy
Regret bounds depend on the expected number of user clusters
Experiments confirm significant performance improvements
Abstract
We investigate a novel cluster-of-bandit algorithm, CAB, for collaborative recommendation tasks that implements the underlying feedback sharing mechanism by estimating the neighborhood of users in a context-dependent manner. CAB makes sharp departures from the state of the art by incorporating collaborative effects into inference as well as learning processes in a manner that seamlessly interleaves explore-exploit tradeoffs and collaborative steps. We prove regret bounds under various assumptions on the data, which exhibit a crisp dependence on the expected number of clusters over the users, a natural measure of the statistical difficulty of the learning task. Experiments on production and real-world datasets show that CAB offers significantly increased prediction performance against a representative pool of state-of-the-art methods.
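To make "estimating the neighborhood of users in a context-dependent manner" concrete, here is a minimal sketch, not the authors' implementation. It assumes each user has a linear payoff estimate (a row of `W`) and an inverse correlation matrix (an entry of `M_invs`), and that two users count as neighbors for an item `x` when their predicted payoffs on `x` differ by no more than the sum of their confidence widths. The function names, the `alpha` parameter, and the averaging rule are illustrative assumptions.

```python
import numpy as np

def confidence_width(M_inv, x, alpha=1.0):
    # Assumed upper-confidence width for a linear bandit: alpha * sqrt(x^T M^{-1} x).
    return alpha * np.sqrt(x @ M_inv @ x)

def neighborhood(user, x, W, M_invs, alpha=1.0):
    # Users whose predicted payoff on item x cannot be told apart from
    # `user`'s prediction, given both users' confidence widths.
    preds = W @ x
    widths = np.array([confidence_width(M_inv, x, alpha) for M_inv in M_invs])
    close = np.abs(preds - preds[user]) <= widths + widths[user]
    return np.flatnonzero(close)

def select_item(user, items, W, M_invs, alpha=1.0):
    # Score each item with the neighborhood-averaged estimate plus the
    # neighborhood-averaged confidence width, then pick the best item.
    scores = []
    for x in items:
        nbrs = neighborhood(user, x, W, M_invs, alpha)
        w_avg = W[nbrs].mean(axis=0)
        cb_avg = np.mean([confidence_width(M_invs[j], x, alpha) for j in nbrs])
        scores.append(w_avg @ x + cb_avg)
    return int(np.argmax(scores))

# Three hypothetical users in two dimensions; users 0 and 1 share a profile.
W = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
M_invs = [0.01 * np.eye(2) for _ in range(3)]
items = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
chosen = select_item(0, items, W, M_invs)
```

In this toy setup the neighborhood of user 0 changes with the item: on `[1, 0]` user 2 is excluded, while on `[1, 1]` all three users predict the same payoff and are grouped together. That item dependence is the sense in which the clustering is context-dependent.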
