Minimally Invasive Randomization for Collecting Unbiased Preferences from Clickthrough Logs
Filip Radlinski, Thorsten Joachims

TL;DR
This paper presents a simple, provably unbiased method for collecting relevance judgments from clickthrough logs by minimally altering how search results are presented. The authors validate the method in real-world experiments and show that learning from the resulting judgments converges to an ideal ranking given sufficient data.
Contribution
It introduces a minimally invasive randomization technique that removes presentation bias from click data, enabling unbiased learning of search rankings.
Findings
The method produces unbiased relevance judgments under reasonable assumptions.
Interactive real-world experiments confirm that the collected judgments are unaffected by presentation bias.
Learning methods that use these judgments can be guaranteed to converge to an ideal ranking given sufficient data.
Abstract
Clickthrough data is a particularly inexpensive and plentiful resource to obtain implicit relevance feedback for improving and personalizing search engines. However, it is well known that the probability of a user clicking on a result is strongly biased toward documents presented higher in the result set, irrespective of relevance. We introduce a simple method to modify the presentation of search results that provably gives relevance judgments that are unaffected by presentation bias under reasonable assumptions. We validate this property of the training data in interactive real-world experiments. Finally, we show that, using these unbiased relevance judgments, learning methods can be guaranteed to converge to an ideal ranking given sufficient data.
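The abstract does not spell out how the presentation is modified, so the following is only a minimal sketch of one way a minimally invasive randomization could look, not the paper's exact procedure. It assumes that adjacent results are grouped into disjoint pairs starting at a random offset, that each pair is swapped with probability 0.5, and that a click on the lower result of a pair (with no click on the upper one) is read as a preference for the lower result. The function names `randomize_adjacent_pairs` and `preferences_from_clicks` are hypothetical.

```python
import random


def randomize_adjacent_pairs(ranking, rng=None):
    """Return (presented, pairs) for a ranked list of document ids.

    Assumption: documents are grouped into disjoint adjacent pairs starting
    at a random offset (0 or 1), and each pair is swapped with probability 0.5.
    No document moves more than one position from its original rank.
    """
    rng = rng or random.Random()
    presented = list(ranking)
    start = rng.randint(0, 1)  # pair (0,1),(2,3),... or (1,2),(3,4),...
    pairs = []
    for i in range(start, len(presented) - 1, 2):
        if rng.random() < 0.5:
            presented[i], presented[i + 1] = presented[i + 1], presented[i]
        # Record the pair in the order the user actually sees it.
        pairs.append((presented[i], presented[i + 1]))
    return presented, pairs


def preferences_from_clicks(pairs, clicked):
    """Return (preferred, over) tuples inferred from clicks.

    Assumption: a click on the lower result of a pair, with no click on the
    upper one, is interpreted as a preference for the lower result.
    """
    clicked = set(clicked)
    return [
        (lower, upper)
        for upper, lower in pairs
        if lower in clicked and upper not in clicked
    ]


if __name__ == "__main__":
    docs = ["d1", "d2", "d3", "d4", "d5"]
    presented, pairs = randomize_adjacent_pairs(docs, random.Random(0))
    print(presented)
    print(preferences_from_clicks(pairs, clicked=[presented[1]]))
```

Because each pair is equally likely to appear in either order, position effects within a pair should cancel out across many queries under these assumptions, which is the intuition behind collecting preferences that are not skewed by rank.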
Taxonomy
Topics: Information Retrieval and Search Behavior · Mobile Crowdsensing and Crowdsourcing · Expert finding and Q&A systems
