Cascading Bandits for Large-Scale Recommendation Problems
Shi Zong, Hao Ni, Kenny Sung, Nan Rosemary Ke, Zheng Wen, and Branislav Kveton

TL;DR
This paper introduces two algorithms for cascading bandits that use linear generalization to efficiently recommend the top K items from a large set, with a regret bound for one of the algorithms and strong empirical performance.
Contribution
The work proposes novel linear generalization-based algorithms for cascading bandits, reducing dependence on the number of items and improving recommendation efficiency.
Findings
The proposed algorithms outperform baselines in experiments.
A regret bound is established for one of the two algorithms.
Linear generalization removes the dependence of the regret on the number of items, which improves scalability.
Abstract
Most recommender systems recommend a list of items. The user examines the list, from the first item to the last, and often chooses the first attractive item and does not examine the rest. This type of user behavior can be modeled by the cascade model. In this work, we study cascading bandits, an online learning variant of the cascade model where the goal is to recommend the K most attractive items from a large set of L candidate items. We propose two algorithms for solving this problem, which are based on the idea of linear generalization. The key idea in our solutions is that we learn a predictor of the attraction probabilities of items from their features, as opposed to learning the attraction probability of each item independently as in the existing work. This results in practical learning algorithms whose regret does not depend on the number of items L. We bound the regret of one…
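The two ideas in the abstract, cascade feedback and a shared linear predictor of attraction probabilities, can be illustrated with a small sketch. This is not the authors' implementation: the class name `LinearCascadeUCB`, the function `simulate_cascade_click`, and the parameters `alpha` and `reg` are hypothetical names chosen for illustration, and the learner is a generic ridge-regression estimator with an upper-confidence bonus, assumed here as one plausible way to realize linear generalization under the cascade model.

```python
import numpy as np


def simulate_cascade_click(attraction_probs, ranked_items, rng):
    """Cascade model: the user scans the list top-down and clicks the
    first attractive item. Returns the clicked position, or None."""
    for pos, item in enumerate(ranked_items):
        if rng.random() < attraction_probs[item]:
            return pos
    return None


class LinearCascadeUCB:
    """Hypothetical sketch: one shared weight vector predicts every item's
    attraction probability from its features (assumed linear model)."""

    def __init__(self, dim, k, alpha=1.0, reg=1.0):
        self.k = k                    # length of the recommended list
        self.alpha = alpha            # exploration strength (assumed knob)
        self.M = reg * np.eye(dim)    # regularized Gram matrix
        self.b = np.zeros(dim)        # feature-weighted click sums

    def recommend(self, features):
        """Return the K items with the highest optimistic estimates."""
        M_inv = np.linalg.inv(self.M)
        theta = M_inv @ self.b
        mean = features @ theta
        # Per-item confidence width sqrt(x^T M^{-1} x).
        width = np.sqrt(np.einsum("ij,jk,ik->i", features, M_inv, features))
        ucb = mean + self.alpha * width
        return [int(i) for i in np.argsort(-ucb)[: self.k]]

    def update(self, features, ranked_items, click_pos):
        """Only items up to and including the click were examined, so
        only those positions yield feedback."""
        last = click_pos if click_pos is not None else len(ranked_items) - 1
        for pos in range(last + 1):
            x = features[ranked_items[pos]]
            self.M += np.outer(x, x)
            self.b += x * (1.0 if pos == click_pos else 0.0)
```

In this sketch the learner stores a matrix and a vector whose sizes depend on the feature dimension rather than on the number of candidate items, which is the intuition behind learning from features instead of estimating each item independently.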
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Recommender Systems and Techniques
