Unidentified and Confounded? Understanding Two-Tower Models for Unbiased Learning to Rank

Philipp Hager; Onno Zoeter; Maarten de Rijke

arXiv:2506.20501·cs.IR·June 26, 2025

TL;DR

This paper analyzes why two-tower learning-to-rank models can rank worse after being trained on clicks collected by well-performing production systems. It examines two explanations, confounding by the logging policy and model identifiability, and proposes a sample weighting technique as mitigation.

Contribution

It provides a theoretical analysis of identifiability conditions and the impact of logging policies on two-tower models, along with a sample weighting method to reduce bias.
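To make the swap condition concrete, the sketch below checks whether positions in a click log are linked by documents shown at more than one position. Reading the condition as connectivity of a position graph is an assumption made for illustration, and the function name `positions_connected` and the `(doc_id, position)` log format are hypothetical, not taken from the paper or its repository.

```python
from collections import defaultdict


def positions_connected(impressions):
    """Return True if all positions are linked through shared documents.

    `impressions` is an iterable of (doc_id, position) pairs (hypothetical
    log format). Two positions are linked when at least one document was
    displayed at both, i.e. the logging policy swapped it between them.
    """
    impressions = list(impressions)
    positions_by_doc = defaultdict(set)
    for doc_id, position in impressions:
        positions_by_doc[doc_id].add(position)

    all_positions = {position for _, position in impressions}
    if not all_positions:
        return True

    # Build an undirected graph over positions.
    adjacency = defaultdict(set)
    for positions in positions_by_doc.values():
        ordered = sorted(positions)
        for other in ordered[1:]:
            adjacency[ordered[0]].add(other)
            adjacency[other].add(ordered[0])

    # Depth-first search from an arbitrary position.
    start = next(iter(all_positions))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == all_positions


# Deterministic logging: every document always sits at the same position.
no_swaps = [("a", 1), ("b", 2), ("c", 3)]
# Some documents were displayed at two different positions.
with_swaps = [("a", 1), ("a", 2), ("b", 2), ("b", 3)]

print(positions_connected(no_swaps))    # False
print(positions_connected(with_swaps))  # True
```

Under this reading, a log with no swaps leaves the positions disconnected, so clicks alone cannot separate position effects from document effects unless document features overlap across positions.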

Findings

01 Logging policies introduce no bias when the model perfectly captures user behavior.

02 Bias amplification occurs when the model captures user behavior imperfectly, especially with correlated prediction errors.

03 The proposed sample weighting technique helps mitigate these bias effects.

Abstract

Additive two-tower models are popular learning-to-rank methods for handling biased user feedback in industry settings. Recent studies, however, report a concerning phenomenon: training two-tower models on clicks collected by well-performing production systems leads to decreased ranking performance. This paper investigates two recent explanations for this observation: confounding effects from logging policies and model identifiability issues. We theoretically analyze the identifiability conditions of two-tower models, showing that either document swaps across positions or overlapping feature distributions are required to recover model parameters from clicks. We also investigate the effect of logging policies on two-tower models, finding that they introduce no bias when models perfectly capture user behavior. However, logging policies can amplify biases when models imperfectly capture…
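As a minimal sketch of the model class the abstract refers to, an additive two-tower model can be written as a relevance logit plus a bias logit passed through a sigmoid. The tower outputs below are made-up numbers, and the per-document and per-position lookup tables stand in for whatever networks a real system would use. The example also shows a basic reason identifiability needs care: adding a constant to every relevance logit and subtracting it from every bias logit leaves all click probabilities unchanged.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def click_probability(relevance_logit: float, bias_logit: float) -> float:
    """Additive two-tower model: sum the two tower outputs, then squash."""
    return sigmoid(relevance_logit + bias_logit)


# Hypothetical tower outputs (illustrative values only).
relevance = {"doc_a": 1.2, "doc_b": -0.4}
position_bias = {1: 0.8, 2: -0.3}

# Shift relevance up and bias down by the same constant.
shift = 5.0
shifted_relevance = {doc: r + shift for doc, r in relevance.items()}
shifted_bias = {pos: b - shift for pos, b in position_bias.items()}

for doc in relevance:
    for pos in position_bias:
        original = click_probability(relevance[doc], position_bias[pos])
        shifted = click_probability(shifted_relevance[doc], shifted_bias[pos])
        # Both parameterizations predict the same clicks.
        assert math.isclose(original, shifted, abs_tol=1e-12)
```

A constant shift of this kind does not change the ordering of documents by relevance, so it is harmless for ranking; the concern raised in the abstract is about whether the parameters can be recovered from clicks beyond such trivial ambiguities.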

Code & Models

Repositories

philipphager/two-tower-confounding (JAX, official)

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning