Offline Evaluation of Ranked Lists using Parametric Estimation of Propensities
Vishwa Vinay, Manoj Kilaru, David Arbour

TL;DR
This paper introduces a parametric approach to propensity estimation for the offline evaluation of ranked lists. Using logged click data, with learning-to-rank methods as subroutines, it estimates the effectiveness of a new ranker more accurately.
Contribution
The paper proposes a parametric propensity estimation method that improves the accuracy of offline evaluation for ranking systems, especially when the new rankings differ from those in the logged data.
Findings
Parametric propensity estimation improves the accuracy of offline evaluation.
The approach uses learning-to-rank methods as subroutines.
It is effective for evaluating new rankings offline.
Abstract
Search engines and recommendation systems attempt to continually improve the quality of the experience they afford to their users. Refining the ranker that produces the lists displayed in response to user requests is an important component of this process. A common practice is for the service providers to make changes (e.g., new ranking features, different ranking models) and A/B test them on a fraction of their users to establish the value of the change. An alternative approach estimates the effectiveness of the proposed changes offline, utilising previously collected clickthrough data on the old ranker to posit what the user behaviour on ranked lists produced by the new ranker would have been. A majority of offline evaluation approaches invoke the well-studied inverse propensity weighting to adjust for biases inherent in logged data. In this paper, we propose the use of parametric…
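The inverse propensity weighting mentioned in the abstract can be illustrated with a minimal sketch. This is a generic position-based estimator and not the parametric method the paper proposes. The function name `ips_expected_clicks`, the log format, and the propensity values are all assumptions made for illustration. Each logged click is up-weighted by the inverse of the examination propensity at its logged rank, then re-weighted by the propensity at the rank the new ranker assigns to the same document.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical log format: (query id, documents in logged order, click per position).
LogEntry = Tuple[str, Sequence[str], Sequence[int]]


def ips_expected_clicks(
    logs: Sequence[LogEntry],
    new_rankings: Dict[str, Sequence[str]],
    propensities: Sequence[float],
) -> float:
    """Estimate the average number of clicks per impression under a new ranker.

    Assumes a position-based click model: a click happens when the user
    examines a rank (with probability propensities[rank]) and finds the
    document relevant. The propensities are taken as given here.
    """
    total = 0.0
    for query, logged_docs, clicks in logs:
        # Position of every document in the new ranking for this query.
        new_pos = {doc: i for i, doc in enumerate(new_rankings[query])}
        for rank, (doc, click) in enumerate(zip(logged_docs, clicks)):
            if not click or doc not in new_pos:
                continue  # unclicked or dropped documents contribute nothing
            if new_pos[doc] >= len(propensities):
                continue  # placed below the ranks we model
            # Inverse propensity weight corrects for position bias in the log.
            debiased = click / propensities[rank]
            # Re-apply the examination probability at the new position.
            total += debiased * propensities[new_pos[doc]]
    return total / len(logs)


# Illustrative data (made up): one impression, click on the second result.
example_logs = [("q1", ["a", "b", "c"], [0, 1, 0])]
example_propensities = [1.0, 0.5, 0.25]
promoted = ips_expected_clicks(
    example_logs, {"q1": ["b", "a", "c"]}, example_propensities
)
unchanged = ips_expected_clicks(
    example_logs, {"q1": ["a", "b", "c"]}, example_propensities
)
```

In this toy case, evaluating the logged ranking itself recovers the observed click count, while promoting the clicked document to the top rank raises the estimate. The estimate is only as good as the propensities fed into it, which is the quantity the paper sets out to estimate.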
Taxonomy
Topics: Game Theory and Voting Systems · Consumer Market Behavior and Pricing · Advanced Causal Inference Techniques
