Cascading Hybrid Bandits: Online Learning to Rank for Relevance and Diversity
Chang Li, Haoyun Feng, Maarten de Rijke

TL;DR
This paper introduces CascadeHybrid, a novel online learning algorithm for ranking items that balances relevance and diversity by modeling user behavior and learning from click feedback, with proven theoretical guarantees and superior experimental performance.
Contribution
It proposes a hybrid contextual bandit approach for ranking that models relevance and diversity separately and learns from user clicks, advancing online learning to rank methods.
Findings
CascadeHybrid outperforms baseline algorithms on real datasets.
Theoretical guarantees are established for the algorithm's performance.
Experimental results demonstrate improved relevance and diversity balance.
Abstract
Relevance ranking and result diversification are two core areas in modern recommender systems. Relevance ranking aims at building a ranked list sorted in decreasing order of item relevance, while result diversification focuses on generating a ranked list of items that covers a broad range of topics. In this paper, we study an online learning setting that aims to recommend a ranked list of items that maximizes the ranking utility, i.e., a list whose items are relevant and whose topics are diverse. We formulate it as the cascade hybrid bandits (CHB) problem. CHB assumes cascading user behavior, where a user browses the displayed list from top to bottom, clicks the first attractive item, and stops browsing the rest. We propose a hybrid contextual bandit approach, called CascadeHybrid, for solving this problem. CascadeHybrid models item relevance and topical diversity using two…
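The cascading user behavior described in the abstract can be illustrated with a short simulation. This is a minimal sketch, not the authors' implementation: the function names `cascade_click` and `observed_positions` and the per-item attraction probabilities are assumptions introduced here for illustration. The sketch shows only the click feedback a learner would receive under the cascade assumption, not how CascadeHybrid estimates relevance or diversity.

```python
import random
from typing import List, Optional, Sequence


def cascade_click(attraction_probs: Sequence[float],
                  rng: random.Random) -> Optional[int]:
    """Simulate one user under the cascade assumption.

    The user scans the list from top to bottom, clicks the first item
    found attractive, and stops. Returns the 0-based position of the
    click, or None if no item was clicked.
    (attraction_probs is a hypothetical input: one probability per position.)
    """
    for position, prob in enumerate(attraction_probs):
        if rng.random() < prob:
            return position  # first attractive item: click and stop
    return None  # the user reached the end without clicking


def observed_positions(num_items: int,
                       click_position: Optional[int]) -> List[int]:
    """Positions the user examined, given where (if anywhere) they clicked.

    Items at or above the click were examined; items below it were not,
    so they carry no feedback for the learner.
    """
    if click_position is None:
        return list(range(num_items))
    return list(range(click_position + 1))


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.1, 0.6, 0.3, 0.8]  # illustrative values only
    click = cascade_click(probs, rng)
    print("click position:", click)
    print("examined positions:", observed_positions(len(probs), click))
```

The learner sees a click at no more than one position per round, and only the items at or above that position give usable feedback. This partial observability is what separates the cascade setting from a setting in which every displayed item is labeled.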
