Online Learning to Rank in Stochastic Click Models
Masrour Zoghi, Tomas Tunys, Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvari, and Zheng Wen

TL;DR
This paper introduces BatchRank, an online learning to rank algorithm applicable to a broad class of click models, including cascade and position-based models, with proven regret bounds and superior empirical performance.
Contribution
We propose BatchRank, the first algorithm for online learning to rank that works across multiple click models with theoretical guarantees.
Findings
BatchRank outperforms ranked bandits in experiments.
BatchRank is more robust than CascadeKL-UCB.
Theoretical regret bounds are established for BatchRank.
Abstract
Online learning to rank is a core problem in information retrieval and machine learning. Many provably efficient algorithms have been recently proposed for this problem in specific click models. The click model is a model of how the user interacts with a list of documents. Though these results are significant, their impact on practice is limited, because all proposed algorithms are designed for specific click models and lack convergence guarantees in other models. In this work, we propose BatchRank, the first online learning to rank algorithm for a broad class of click models. The class encompasses the two most fundamental click models, the cascade and position-based models. We derive a gap-dependent upper bound on the T-step regret of BatchRank and evaluate it on a range of web search queries. We observe that BatchRank outperforms ranked bandits and is more robust than CascadeKL-UCB, an…
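To make the two click models named in the abstract concrete, here is a minimal simulation sketch. It follows the common textbook descriptions of the cascade and position-based models and is not taken from the paper. The function names, the `attraction` and `examination` parameters, and the example values are all hypothetical.

```python
import random


def cascade_clicks(ranked, attraction, rng):
    """Cascade model (assumed form): the user scans the list top-down,
    clicks the first document found attractive, and then stops."""
    clicks = [0] * len(ranked)
    for pos, doc in enumerate(ranked):
        if rng.random() < attraction[doc]:
            clicks[pos] = 1
            break  # at most one click per list in this model
    return clicks


def position_based_clicks(ranked, attraction, examination, rng):
    """Position-based model (assumed form): a document is clicked only if
    its position is examined and the document itself is attractive."""
    clicks = []
    for pos, doc in enumerate(ranked):
        examined = rng.random() < examination[pos]
        attracted = rng.random() < attraction[doc]
        clicks.append(int(examined and attracted))
    return clicks


# Illustrative usage with made-up probabilities.
rng = random.Random(0)
ranked = ["a", "b", "c"]
attraction = {"a": 0.2, "b": 0.6, "c": 0.4}   # hypothetical per-document values
examination = [1.0, 0.5, 0.25]                # hypothetical per-position values
print(cascade_clicks(ranked, attraction, rng))
print(position_based_clicks(ranked, attraction, examination, rng))
```

In the cascade sketch a list yields at most one click, whereas the position-based sketch can yield several. An algorithm tuned to one of these behaviours may not carry over to the other, which is the limitation the abstract describes.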
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
