Online Learning to Rank under Corruption: A Robust Cascading Bandits Approach
Fatemeh Ghaffari, Siddarth Sitaraman, Xutong Liu, Xuchuang Wang, Mohammad Hajiesmaili

TL;DR
This paper introduces MSUCB, a robust online learning-to-rank algorithm that tolerates corrupted click feedback and keeps regret low even under adversarial manipulation.
Contribution
The paper proposes MSUCB, an algorithm built on a novel mean-of-medians estimator; the authors state that, to their knowledge, this is the first time such an estimator has been applied to bandits with corruption.
Findings
Achieves optimal logarithmic regret without corruption
Degrades gracefully under corruption with minimal regret increase
Outperforms prior methods with up to 97.35% regret improvement
Abstract
Online learning to rank (OLTR) studies how to recommend a short ranked list of items from a large pool and improve future rankings based on user clicks. This setting is commonly modeled as cascading bandits, where the objective is to maximize the likelihood that the user clicks on at least one of the presented items across as many timesteps as possible. However, such systems are vulnerable to click fraud and other manipulations (i.e., corruption), where bots or paid click farms inject corrupted feedback that misleads the learning process and degrades user experience. In this paper, we propose MSUCB, a robust algorithm that incorporates a novel mean-of-medians estimator, which to our knowledge is applied to the bandits-with-corruption setting for the first time. This estimator behaves like a standard mean in the absence of corruption, so no cost is paid for robustness. Under corruption, the…
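The cascading-bandit objective mentioned in the abstract can be illustrated with a small sketch. It is not the MSUCB algorithm and does not reproduce the paper's estimator. It only assumes the standard cascade click model, in which the user scans the list from the top, each item is attractive independently with some probability, and the user clicks the first attractive item. The function names and the example probabilities are hypothetical.

```python
import random


def click_probability(attractions):
    """Probability of at least one click on a list: 1 - prod(1 - w_i).

    `attractions` holds the assumed per-item attraction probabilities
    of the items shown (independence between items is an assumption).
    """
    no_click = 1.0
    for w in attractions:
        no_click *= 1.0 - w
    return 1.0 - no_click


def simulate_cascade(ranked_items, attraction, rng):
    """Simulate one round of cascade feedback.

    Returns the position of the first clicked item, or None if the user
    scans the whole list without clicking. Items before the click are
    observed as skipped; items after it are unobserved.
    """
    for position, item in enumerate(ranked_items):
        if rng.random() < attraction[item]:
            return position
    return None


if __name__ == "__main__":
    # Hypothetical attraction probabilities for three items.
    attraction = {"a": 0.5, "b": 0.5, "c": 0.1}
    print(click_probability([attraction["a"], attraction["b"]]))
    print(simulate_cascade(["a", "b"], attraction, random.Random(0)))
```

Under this model, the learner sees only the click position, so a corrupted click can distort the estimates of every item ranked at or above it in that round.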
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Recommender Systems and Techniques
