Minimally Invasive Randomization for Collecting Unbiased Preferences from Clickthrough Logs

Filip Radlinski; Thorsten Joachims

arXiv:cs/0605037 · cs.IR · May 23, 2007 · 53 cites

Open Access

TL;DR

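This paper presents a simple method for collecting relevance judgments from clickthrough logs by minimally altering how search results are presented. Under reasonable assumptions the judgments are provably unaffected by presentation bias. The method is validated in real-world experiments, and learning from the resulting judgments comes with a convergence guarantee.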

Contribution

It introduces a minimally invasive randomization technique that removes presentation bias from click data, enabling unbiased learning of search rankings.

Findings

01

The method produces unbiased relevance judgments under reasonable assumptions.

02

Interactive real-world experiments validate that the collected judgments are unaffected by presentation bias.

03

Learning methods that use these judgments can be guaranteed to converge to an ideal ranking given sufficient data.

Abstract

Clickthrough data is a particularly inexpensive and plentiful resource to obtain implicit relevance feedback for improving and personalizing search engines. However, it is well known that the probability of a user clicking on a result is strongly biased toward documents presented higher in the result set irrespective of relevance. We introduce a simple method to modify the presentation of search results that provably gives relevance judgments that are unaffected by presentation bias under reasonable assumptions. We validate this property of the training data in interactive real-world experiments. Finally, we show that, using these unbiased relevance judgments, learning methods can be guaranteed to converge to an ideal ranking given sufficient data.
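The abstract does not spell out how the presentation is modified. As an illustration only, the sketch below assumes one plausible form of minimal randomization: adjacent results are grouped into pairs and each pair is flipped with probability 1/2, so that within a pair each document is shown on top equally often. The function names, the random choice of pairing offset, and the click rule (a click on the lower result of a pair with no click on the upper one counts as a preference) are all assumptions made for this sketch, not details taken from the text above.

```python
import random


def randomize_adjacent_pairs(ranking, rng=None):
    """Return (presented, pairs) for one query.

    Hypothetical helper: groups adjacent results into pairs and flips each
    pair with probability 1/2. No document moves more than one position.
    """
    rng = rng or random.Random()
    presented = list(ranking)
    # Assumption: pairing starts at rank 0 or rank 1, chosen at random,
    # so every adjacent pair of ranks can eventually be compared.
    offset = rng.choice([0, 1])
    pairs = []
    for i in range(offset, len(presented) - 1, 2):
        if rng.random() < 0.5:
            presented[i], presented[i + 1] = presented[i + 1], presented[i]
        # Record the pair as (upper, lower) in the order actually shown.
        pairs.append((presented[i], presented[i + 1]))
    return presented, pairs


def preferences_from_clicks(pairs, clicked):
    """Turn clicks into (preferred, other) judgments.

    Assumed rule: a click on the lower result of a pair, with no click on
    the upper result, is read as a preference for the lower result.
    """
    clicked = set(clicked)
    prefs = []
    for upper, lower in pairs:
        if lower in clicked and upper not in clicked:
            prefs.append((lower, upper))
    return prefs
```

Because each pair is flipped by a fair coin, any tendency to click the upper position affects both documents in a pair equally, which is the intuition behind judgments that do not depend on presentation order. A logging pipeline would call `randomize_adjacent_pairs` once per query, show `presented` to the user, and pass the recorded `pairs` together with the observed clicks to `preferences_from_clicks`.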

Taxonomy

Topics: Information Retrieval and Search Behavior · Mobile Crowdsensing and Crowdsourcing · Expert finding and Q&A systems