Counterfactual Learning to Rank using Heterogeneous Treatment Effect Estimation
Mucun Tian, Chun Guo, Vito Ostuni, Zhen Zhu

TL;DR
This paper uses heterogeneous treatment effect estimation to debias click data for Learning-to-Rank (LTR) models. The approach is reported to be especially effective when intervention data is limited and for long-tail queries.
Contribution
The paper proposes a position-bias estimation method that lowers estimation variance and remains usable with sparse data, which in turn improves unbiased learning-to-rank models.
Findings
Effective for long-tail queries
Performs well with limited intervention data
Reduces the variance of position-bias (propensity) estimates from click data
Abstract
Learning-to-Rank (LTR) models trained from implicit feedback (e.g. clicks) suffer from inherent biases. A well-known one is the position bias -- documents in top positions are more likely to receive clicks due in part to their position advantages. To unbiasedly learn to rank, existing counterfactual frameworks first estimate the propensity (probability) of missing clicks with intervention data from a small portion of search traffic, and then use inverse propensity score (IPS) to debias LTR algorithms on the whole data set. These approaches often assume the propensity only depends on the position of the document, which may cause high estimation variance in applications where the search context (e.g. query, user) varies frequently. While context-dependent propensity models reduce variance, accurate estimations may require randomization or intervention on a large amount of traffic, which…
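The two-step pipeline the abstract describes (estimate position propensities from intervention traffic, then reweight clicks by inverse propensity) can be illustrated with a minimal sketch. This is not the paper's heterogeneous treatment effect method: it is the simple position-only baseline the abstract contrasts against. It assumes the intervention log comes from traffic whose result order was uniformly randomized, so that click-through rate differences between positions reflect position bias alone. The function names and the `(position, clicked)` log format are hypothetical.

```python
from collections import defaultdict


def estimate_propensities(intervention_log):
    """Estimate relative position propensities from randomized traffic.

    intervention_log: iterable of (position, clicked) pairs, assumed to come
    from impressions whose ranking was uniformly randomized (an assumption of
    this sketch, not something the source specifies).
    Returns {position: propensity}, normalized so the top position is 1.0.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for position, was_clicked in intervention_log:
        shown[position] += 1
        clicked[position] += int(was_clicked)
    ctr = {p: clicked[p] / shown[p] for p in shown}
    top_ctr = ctr[min(ctr)]  # raises ZeroDivisionError below if the top position has no clicks
    return {p: ctr[p] / top_ctr for p in ctr}


def ips_weighted_clicks(click_log, propensities):
    """Weight each click by 1 / propensity of its position; non-clicks get 0."""
    return [
        (1.0 / propensities[position]) if was_clicked else 0.0
        for position, was_clicked in click_log
    ]


# Toy log: position 1 is clicked in 2 of 4 impressions, position 2 in 1 of 4.
demo_log = [
    (1, True), (1, True), (1, False), (1, False),
    (2, True), (2, False), (2, False), (2, False),
]
demo_propensities = estimate_propensities(demo_log)
demo_weights = ips_weighted_clicks(demo_log, demo_propensities)
print(demo_propensities)  # {1: 1.0, 2: 0.5}
print(demo_weights)       # [1.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
```

In the toy log, the single click at position 2 counts double, which offsets the lower chance that a document shown there is examined. A position-only estimate like this uses one propensity per rank for all queries and users; the abstract's concern is that this can be a poor fit when the search context varies, while context-dependent estimates may need far more intervention traffic.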
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Advanced Bandit Algorithms Research
