# Addressing Personalized Bias for Unbiased Learning to Rank

**Authors:** Zechun Niu, Lang Mei, Liu Yang, Ziyuan Zhao, Qiang Yan, Jiaxin Mao, and Ji-Rong Wen

arXiv: 2508.20798 · 2025-08-29

## TL;DR

This paper introduces a user-aware unbiased learning to rank framework that models individual user behaviors to correct personalized biases, improving ranking accuracy in web search.

## Contribution

It proposes a novel user-aware inverse-propensity-score estimator that accounts for personalized user behaviors, addressing biases overlooked by previous methods.

## Key findings

- The user-aware estimator is theoretically unbiased under mild assumptions.
- The estimator has lower variance than the straightforward alternative of computing a user-dependent propensity for each impression.
- Experiments on two semi-synthetic datasets and one real-world dataset empirically verify the estimator's effectiveness.

## Abstract

Unbiased learning to rank (ULTR), which aims to learn unbiased ranking models from biased user behavior logs, plays an important role in Web search. Previous research on ULTR has studied a variety of biases in users' clicks, such as position bias, presentation bias, and outlier bias. However, existing work often assumes that the behavior logs are collected from an "average" user, neglecting the differences between different users in their search and browsing behaviors. In this paper, we introduce personalized factors into the ULTR framework, which we term the user-aware ULTR problem. Through a formal causal analysis of this problem, we demonstrate that existing user-oblivious methods are biased when different users have different preferences over queries and personalized propensities of examining documents. To address such a personalized bias, we propose a novel user-aware inverse-propensity-score estimator for learning-to-rank objectives. Specifically, our approach models the distribution of user browsing behaviors for each query and aggregates user-weighted examination probabilities to determine propensities. We theoretically prove that the user-aware estimator is unbiased under some mild assumptions and shows lower variance compared to the straightforward way of calculating a user-dependent propensity for each impression. Finally, we empirically verify the effectiveness of our user-aware estimator by conducting extensive experiments on two semi-synthetic datasets and a real-world dataset.
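The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple position-based setting in which each user has one examination probability per rank, and it reads "aggregates user-weighted examination probabilities" as a per-query weighted average over users. The function names (`aggregate_propensities`, `ips_weighted_clicks`), the data layout, and the toy users are all hypothetical.

```python
from collections import defaultdict


def aggregate_propensities(user_dist, user_exam):
    """Per-query propensity at each rank, as a user-weighted average.

    user_dist: {query: {user: P(user | query)}}   (assumed to sum to 1 per query)
    user_exam: {user: [examination probability at rank 0, 1, ...]}
    Both structures are illustrative assumptions, not taken from the paper.
    """
    propensities = {}
    for query, dist in user_dist.items():
        # Only use ranks for which every user of this query has a value.
        n_ranks = min(len(user_exam[user]) for user in dist)
        agg = [0.0] * n_ranks
        for user, weight in dist.items():
            for rank in range(n_ranks):
                agg[rank] += weight * user_exam[user][rank]
        propensities[query] = agg
    return propensities


def ips_weighted_clicks(click_log, propensities):
    """Sum of inverse-propensity-weighted clicks per (query, doc).

    click_log: iterable of (query, doc, rank, clicked) tuples.
    Each click is weighted by 1 / (aggregated propensity of its query and rank),
    rather than by a propensity specific to the user behind that impression.
    """
    totals = defaultdict(float)
    for query, doc, rank, clicked in click_log:
        if clicked:
            totals[(query, doc)] += 1.0 / propensities[query][rank]
    return dict(totals)


# Toy example: one query issued mostly by a "hasty" user who rarely looks at rank 1.
user_dist = {"q1": {"u_patient": 0.25, "u_hasty": 0.75}}
user_exam = {"u_patient": [1.0, 0.8], "u_hasty": [1.0, 0.4]}
props = aggregate_propensities(user_dist, user_exam)

log = [("q1", "d1", 0, True), ("q1", "d2", 1, True), ("q1", "d2", 1, False)]
weighted = ips_weighted_clicks(log, props)
```

In this toy setup the aggregated propensity at rank 1 is the average of 0.8 and 0.4 weighted by how often each user issues the query, so a click at that rank receives the same weight regardless of which user produced it. The abstract contrasts this with computing a user-dependent propensity for each impression, which the paper reports to have higher variance.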

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20798/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20798/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/2508.20798/full.md

---
Source: https://tomesphere.com/paper/2508.20798