Inverse Propensity Score based offline estimator for deterministic ranking lists using position bias

Nick Wood; Sumit Sidana

arXiv:2208.14980·cs.IR·September 1, 2022

Open Access

TL;DR

This paper introduces an inverse propensity score (IPS) estimator for deterministic ranking lists that derives propensities from a position-bias model, widening the set of policies that offline policy evaluation can cover. It is validated on industry-scale data.

Contribution

It proposes a novel IPS-based estimator tailored for deterministic policies using position bias modeling, expanding the applicability of offline policy evaluation.

Findings

01 Offline estimates correlate strongly with online results, up to a constant bias

02 The estimator requires an examination model that approximates real user behavior reasonably well

03 Validated in two experiments on industry-scale data

Abstract

In this work, we present a novel way of computing IPS using a position-bias model for deterministic logging policies. This technique significantly widens the policies on which OPE can be used. We validate this technique using two different experiments on industry-scale data. The OPE results are clearly strongly correlated with the online results, with some constant bias. The estimator requires the examination model to be a reasonably accurate approximation of real user behaviour.
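
The abstract does not spell out the estimator, so the following is only a minimal sketch of one common way a position-based examination model can stand in for propensities when the logging policy is deterministic. It is not necessarily the authors' formulation. The assumption here is that a click requires examination, and that examination depends only on rank, so a click logged at rank k is reweighted by the ratio of the examination probability at the item's rank under the new policy to the examination probability at rank k. The function name `pbm_ips_estimate`, the log layout, and the examination values are all hypothetical.

```python
from typing import Dict, List, Sequence


def pbm_ips_estimate(
    logs: Sequence[dict],
    new_rankings: Dict[str, List[str]],
    examination: Sequence[float],
) -> float:
    """Estimate expected clicks per impression under a new deterministic ranking.

    Assumption (not taken from the paper): a position-based model in which
    examination[k] is the probability that a user examines rank k (0-indexed).
    """
    if not logs:
        raise ValueError("logs must not be empty")

    total = 0.0
    for entry in logs:
        # Rank of each item under the new (target) deterministic policy.
        target_rank = {item: r for r, item in enumerate(new_rankings[entry["query"]])}
        for logged_rank, (item, clicked) in enumerate(zip(entry["items"], entry["clicks"])):
            if not clicked:
                continue
            r = target_rank.get(item)
            # Items the new policy does not show, or shows below the modelled
            # depth, contribute no expected clicks in this sketch.
            if r is None or r >= len(examination):
                continue
            # Reweight the logged click by the change in examination probability.
            total += examination[r] / examination[logged_rank]
    return total / len(logs)


# Hypothetical examination probabilities for ranks 0, 1, 2.
examination = [1.0, 0.5, 0.25]

# One logged impression: item "b" was shown at rank 1 and clicked.
logs = [{"query": "q1", "items": ["a", "b", "c"], "clicks": [0, 1, 0]}]

# The new policy promotes "b" to rank 0; the logged policy is kept for comparison.
promoted = pbm_ips_estimate(logs, {"q1": ["b", "a", "c"]}, examination)
unchanged = pbm_ips_estimate(logs, {"q1": ["a", "b", "c"]}, examination)
```

In this toy example the promoted ranking doubles the weight of the logged click, because "b" moves from a rank examined half the time to one that is always examined. The sketch also shows why the abstract stresses the examination model: every weight is a ratio of modelled examination probabilities, so errors in that model carry directly into the estimate.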

Taxonomy

Topics: Game Theory and Voting Systems · Auction Theory and Applications · Recommender Systems and Techniques