TL;DR
This paper introduces a Dual Learning Algorithm (DLA) that jointly learns unbiased ranking models and propensity models directly from biased click data, improving over existing methods that treat these estimations separately.
Contribution
The paper proposes a unified dual learning framework that simultaneously estimates click propensities and ranking models, enabling unbiased learning directly from biased data without preprocessing.
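The joint estimation described above can be sketched for a single ranked list. This is a minimal illustration, not the authors' implementation: it assumes list-wise softmax losses in which clicks are weighted by inverse propensity when scoring the ranker and by inverse relevance when scoring the propensity model, with both weights normalized to the first position. The function name `dual_losses` and its arguments are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - np.max(z))
    return e / e.sum()

def dual_losses(rank_scores, prop_logits, clicks):
    """Return (ranking_loss, propensity_loss) for one displayed list.

    rank_scores: ranker output per document, in displayed order (assumed input).
    prop_logits: propensity-model output per position (assumed input).
    clicks:      0/1 click indicator per position.
    """
    clicks = np.asarray(clicks, dtype=float)
    rel = softmax(rank_scores)    # relevance distribution over the list
    prop = softmax(prop_logits)   # examination distribution over positions

    # Weights are relative to position 1; in training they would be
    # treated as constants (no gradient flows through them).
    inv_propensity = prop[0] / prop
    inv_relevance = rel[0] / rel

    ranking_loss = -np.sum(clicks * inv_propensity * np.log(rel))
    propensity_loss = -np.sum(clicks * inv_relevance * np.log(prop))
    return float(ranking_loss), float(propensity_loss)

# Example: clicks at positions 1 and 3 of a three-document list.
r_loss, p_loss = dual_losses([2.0, 1.0, 0.0], [1.0, 0.5, 0.0], [1, 0, 1])
```

In a full training loop, each model would be updated on its own loss while the other model's current estimates supply the weights, which is what lets the two estimates refine each other.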
Findings
DLA outperforms existing unbiased learning-to-rank methods on synthetic and real-world data.
The joint learning approach adapts to changing bias distributions.
DLA can be applied to online learning scenarios.
Abstract
Learning to rank with biased click data is a well-known challenge. A variety of methods have been explored to debias click data for learning to rank, such as click models, result interleaving, and, more recently, the unbiased learning-to-rank framework based on inverse propensity weighting. Despite their differences, most existing studies separate the estimation of click bias (namely the propensity model) from the learning of ranking algorithms. To estimate click propensities, they either conduct online result randomization, which can negatively affect the user experience, or offline parameter estimation, which has special requirements for click data and is optimized for objectives (e.g., click likelihood) that are not directly related to the ranking performance of the system. In this work, we address those problems by unifying the learning of propensity models and ranking models.…
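To make the inverse propensity weighting idea mentioned in the abstract concrete, here is a minimal sketch. It assumes a position-based click model, in which a click occurs only when a document is both examined and relevant; the propensity, relevance, and loss values are made up for illustration, and `ipw_loss` is a hypothetical helper name.

```python
import numpy as np

def ipw_loss(per_doc_loss, clicks, propensities):
    """Sum per-document losses over clicks, each divided by its propensity."""
    per_doc_loss = np.asarray(per_doc_loss, dtype=float)
    clicks = np.asarray(clicks, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    return float(np.sum(clicks * per_doc_loss / propensities))

# Illustrative values (assumptions, not from the paper).
propensity = np.array([1.0, 0.5, 0.25])  # chance each position is examined
relevance = np.array([0.0, 1.0, 1.0])    # true relevance, unknown in practice
loss = np.array([0.2, 0.7, 1.1])         # per-document ranking loss

# Under the assumed click model, E[click_i] = propensity_i * relevance_i.
expected_clicks = propensity * relevance

# Plugging expected clicks into the IPW loss recovers the loss that would
# be computed from true relevance labels, so the estimator is unbiased.
expected_ipw = ipw_loss(loss, expected_clicks, propensity)
full_information = float(np.sum(relevance * loss))
```

The catch, and the motivation for the paper, is that the propensities must be known or estimated before this correction can be applied.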
Taxonomy
Methods: Dual Learning Algorithm
