Online Learning to Rank under Corruption: A Robust Cascading Bandits Approach

Fatemeh Ghaffari; Siddarth Sitaraman; Xutong Liu; Xuchuang Wang; Mohammad Hajiesmaili

arXiv:2511.03074 · cs.LG · November 6, 2025

Open Access

TL;DR

This paper introduces MSUCB, a robust online learning-to-rank algorithm that keeps regret low even when click feedback is corrupted by adversarial manipulation.

Contribution

The paper proposes MSUCB, an algorithm built on a novel mean-of-medians estimator that, to the authors' knowledge, is applied to the bandits-with-corruption setting for the first time.

Findings

01 Achieves optimal logarithmic regret without corruption

02 Degrades gracefully under corruption with minimal regret increase

03 Outperforms prior methods with up to 97.35% regret improvement

Abstract

Online learning to rank (OLTR) studies how to recommend a short ranked list of items from a large pool and improve future rankings based on user clicks. This setting is commonly modeled as cascading bandits, where the objective is to maximize the likelihood that the user clicks on at least one of the presented items across as many timesteps as possible. However, such systems are vulnerable to click fraud and other manipulations (i.e., corruption), where bots or paid click farms inject corrupted feedback that misleads the learning process and degrades user experience. In this paper, we propose MSUCB, a robust algorithm that incorporates a novel mean-of-medians estimator, which, to our knowledge, is applied to the bandits-with-corruption setting for the first time. This estimator behaves like a standard mean in the absence of corruption, so no cost is paid for robustness. Under corruption, the…
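
To make the cascading-bandit objective in the abstract concrete, here is a minimal sketch of the standard cascade click model. It is an illustration under stated assumptions, not the paper's MSUCB algorithm: each item is assumed to have an independent attraction probability, and the user is assumed to scan the list top to bottom and stop at the first click. The function names and the example probabilities are hypothetical.

```python
import random
from typing import Optional, Sequence


def click_probability(attraction: Sequence[float]) -> float:
    """Probability of at least one click on a ranked list.

    Assumes independent per-item attraction probabilities (the standard
    cascade-model assumption; not taken from the paper's text).
    """
    no_click = 1.0
    for w in attraction:
        no_click *= 1.0 - w  # the user skips this item with probability 1 - w
    return 1.0 - no_click


def simulate_cascade(attraction: Sequence[float], rng: random.Random) -> Optional[int]:
    """Simulate one round: return the position of the first click, or None."""
    for position, w in enumerate(attraction):
        if rng.random() < w:
            return position  # the user clicks and stops scanning
    return None  # the user examined every item without clicking


if __name__ == "__main__":
    ranked_list = [0.5, 0.5]  # hypothetical attraction probabilities
    print(click_probability(ranked_list))
    print(simulate_cascade(ranked_list, random.Random(0)))
```

In this model the learner only observes the click position, so items ranked below the click yield no feedback in that round. Corrupted feedback, in the abstract's sense, would amount to an adversary altering the observed clicks before the learner sees them.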

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Recommender Systems and Techniques