Cascade Model-based Propensity Estimation for Counterfactual Learning to Rank
Ali Vardasbi, Maarten de Rijke, Ilya Markov

TL;DR
This paper introduces CM-IPS, a new propensity estimation method tailored for cascade user click behavior in search ranking, improving counterfactual learning to rank performance when clicks follow the cascade model.
Contribution
The paper proposes CM-IPS, a propensity estimation method designed for cascade click behavior, and provides a strategy to select between cascade and PBM-based methods based on user click data.
Findings
CM-IPS keeps CLTR performance close to full-information performance when user clicks follow the cascade model.
PBM-based CLTR performs poorly when user clicks follow the cascade model.
The paper provides a method for choosing between CM-based and PBM-based propensity estimation using historical click data.
Abstract
Unbiased counterfactual learning to rank (CLTR) requires click propensities to compensate, via inverse propensity scoring (IPS), for the difference between user clicks and the true relevance of search results. Current propensity estimation methods assume that user click behavior follows the position-based model (PBM) and estimate click propensities under this assumption. In reality, however, user clicks often follow the cascade model (CM), in which users scan search results from top to bottom and each click depends on the previous one. In this cascade scenario, PBM-based propensity estimates are inaccurate, which in turn hurts CLTR performance. In this paper, we propose a propensity estimation method for the cascade scenario, called CM-IPS. We show that CM-IPS keeps CLTR performance close to full-information performance when user clicks follow the CM, while PBM-based CLTR shows a significant gap to full-information performance. The opposite is true if the user clicks…
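To make the contrast in the abstract concrete, the following is a minimal sketch of how examination propensities differ under the two click models. It is not the paper's CM-IPS estimator. It assumes the simplest textbook cascade model, in which the user scans from the top and stops at the first click, and all function names (`pbm_examination`, `cascade_examination`, `ips_weights`) are hypothetical, introduced here for illustration only.

```python
from typing import List


def pbm_examination(position_bias: List[float]) -> List[float]:
    """PBM: the chance a rank is examined depends only on the rank itself."""
    return list(position_bias)


def cascade_examination(click_probs: List[float]) -> List[float]:
    """Basic cascade model (an assumption of this sketch): the user scans
    top to bottom and stops at the first click, so rank k is examined only
    if none of the items above it was clicked."""
    probs = []
    reach = 1.0  # probability that the user reaches the current rank
    for p_click in click_probs:
        probs.append(reach)
        reach *= 1.0 - p_click
    return probs


def ips_weights(clicks: List[int], propensities: List[float]) -> List[float]:
    """IPS: each observed click is weighted by 1 / propensity, so clicks on
    rarely examined items count for more."""
    return [c / p if c else 0.0 for c, p in zip(clicks, propensities)]


if __name__ == "__main__":
    # Hypothetical per-item click probabilities for one ranked list.
    propensities = cascade_examination([0.5, 0.5, 0.0])
    print(propensities)                          # [1.0, 0.5, 0.25]
    print(ips_weights([1, 0, 1], propensities))  # [1.0, 0.0, 4.0]
```

Under the cascade assumption, the propensity at a given rank changes with the items ranked above it, whereas a PBM propensity is fixed per rank. This is one way to see why position-only estimates can be inaccurate when clicks follow a cascade.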
