Unbiased Top-k Learning to Rank with Causal Likelihood Decomposition

Haiyuan Zhao; Jun Xu; Xiao Zhang; Guohao Cai; Zhenhua Dong; Ji-Rong Wen

arXiv:2204.00815·cs.IR·June 14, 2024

TL;DR

This paper introduces Causal Likelihood Decomposition (CLD), a novel unified method that effectively mitigates both position bias and sample selection bias in top-k learning to rank, improving ranking accuracy.

Contribution

The paper proposes CLD, a causal graph-based approach that simultaneously addresses position bias and sample selection bias in top-k ranking models, unifying pointwise and pairwise learning.

Findings

1. CLD outperforms baseline methods in mitigating biases.
2. CLD is robust to bias severity and click noise.
3. The approach is theoretically sound and versatile.

Abstract

Unbiased learning to rank has been proposed to alleviate the biases in search ranking, making it possible to train ranking models with user interaction data. In real applications, search engines are designed to display only the k most relevant documents from the retrieved candidate set. The remaining candidates are discarded. As a consequence, position bias and sample selection bias usually occur simultaneously. Existing unbiased learning to rank approaches either focus on one type of bias (e.g., position bias) or mitigate position bias and sample selection bias with separate components, overlooking their associations. In this study, we first analyze the mechanisms and associations of position bias and sample selection bias from the viewpoint of a causal graph. Based on the analysis, we propose Causal Likelihood Decomposition (CLD), a unified approach to simultaneously mitigating…
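To make the two biases named in the abstract concrete, the following is a minimal toy simulation, not the paper's CLD method. It assumes a simple position-based click model in which a click requires the user to examine a rank and then find the document relevant. The function names, the examination probabilities, and the relevance values are all hypothetical and chosen only for illustration.

```python
import random


def simulate_top_k_clicks(relevance, k, examine_prob, rng):
    """Simulate one session of a toy position-based click model with a top-k cutoff.

    relevance[i]: assumed P(click | examined) for the document at rank i.
    examine_prob[i]: assumed P(user examines rank i) for the k displayed ranks.
    Returns one entry per candidate: 1 or 0 for displayed documents,
    None for documents below the cutoff (never shown, so no feedback exists).
    """
    if len(examine_prob) < k:
        raise ValueError("need an examination probability for each displayed rank")
    feedback = []
    for rank, rel in enumerate(relevance):
        if rank >= k:
            # Sample selection bias: the document is discarded before display.
            feedback.append(None)
            continue
        # Position bias: lower ranks are examined less often.
        examined = rng.random() < examine_prob[rank]
        clicked = examined and rng.random() < rel
        feedback.append(1 if clicked else 0)
    return feedback


def click_rates(relevance, k, examine_prob, sessions, seed=0):
    """Average observed click rate per displayed rank over many sessions."""
    rng = random.Random(seed)
    shown = min(k, len(relevance))
    totals = [0] * shown
    for _ in range(sessions):
        feedback = simulate_top_k_clicks(relevance, k, examine_prob, rng)
        for rank in range(shown):
            totals[rank] += feedback[rank]
    return [total / sessions for total in totals]


# Five equally relevant documents, only the top 3 displayed.
rates = click_rates([0.8] * 5, k=3, examine_prob=[1.0, 0.5, 0.25], sessions=2000)
print(rates)
```

In this sketch all five documents are equally relevant, yet the observed click rate falls with rank (position bias) and the last two documents yield no feedback at all (sample selection bias). A model trained naively on such logs would see both distortions at once, which is the situation the abstract describes.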

Code & Models

Repositories

hyz20/cld (PyTorch, official)

Taxonomy

Topics: Data-Driven Disease Surveillance · Bayesian Modeling and Causal Inference · Domain Adaptation and Few-Shot Learning