Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Haruka Kiyohara; Yuta Saito; Tatsuya Matsuhiro; Yusuke Narita; Nobuyuki Shimizu; Yasuo Yamamoto

arXiv:2202.01562 · stat.ML · February 4, 2022

TL;DR

This paper introduces a new off-policy evaluation method for ranking policies that balances bias and variance by leveraging the cascade user behavior model, improving accuracy in real-world recommender systems.

Contribution

The paper proposes the Cascade Doubly Robust (Cascade-DR) estimator. It is unbiased in more cases than existing estimators that rely on stronger user-behavior assumptions, and it reduces variance by using a control variate.
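The idea of a doubly robust estimate under a cascade assumption can be illustrated with a minimal sketch. It assumes the standard recursive doubly robust form from sequential off-policy evaluation, applied position by position; the paper's exact notation and definitions may differ. The function name `cascade_dr_estimate` and all argument names are hypothetical, and the per-position importance ratios, the baseline `q_hat`, and its expectation `v_hat` under the evaluation policy are assumed to be precomputed by the caller.

```python
import numpy as np


def cascade_dr_estimate(weights, rewards, q_hat, v_hat, alpha=None):
    """Sketch of a cascade-style doubly robust value estimate (hypothetical helper).

    All arrays have shape (n, K): n logged rankings, K positions.
      weights[i, k]: ratio pi(a_k | x, a_{1:k-1}) / pi_0(a_k | x, a_{1:k-1})
      rewards[i, k]: reward observed at position k
      q_hat[i, k]:   estimated remaining reward from position k onward,
                     evaluated at the logged actions a_{1:k}
      v_hat[i, k]:   expectation of that estimate over a_k drawn from the
                     evaluation policy, given the logged prefix a_{1:k-1}
    alpha: optional (K,) position weights; defaults to all ones (assumption).
    """
    weights = np.asarray(weights, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    q_hat = np.asarray(q_hat, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    n, K = rewards.shape
    if alpha is None:
        alpha = np.ones(K)

    # Cumulative importance weight over positions 1..k.
    w_cum = np.cumprod(weights, axis=1)
    # Cumulative weight over positions 1..k-1 (equal to 1 at the first position).
    w_prev = np.concatenate([np.ones((n, 1)), w_cum[:, :-1]], axis=1)

    # Importance-weighted residual plus the baseline term (the control variate).
    per_ranking = np.sum(w_cum * (alpha * rewards - q_hat) + w_prev * v_hat, axis=1)
    return float(per_ranking.mean())
```

With `q_hat` and `v_hat` set to zero, the sketch reduces to an importance-weighted estimate that uses only the cumulative weights of the positions above each item. A nonzero baseline leaves the weighting unchanged and only subtracts and adds back a model-based term, which is how a control variate can lower variance.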

Findings

01 The estimator outperforms existing methods on synthetic data.

02 It yields more accurate evaluations than existing estimators on real-world data.

03 It effectively balances bias and variance in OPE of ranking policies.

Abstract

In real-world recommender systems and search engines, optimizing ranking decisions to present a ranked list of relevant items is critical. Off-policy evaluation (OPE) for ranking policies is thus gaining a growing interest because it enables performance estimation of new ranking policies using only logged data. Although OPE in contextual bandits has been studied extensively, its naive application to the ranking setting faces a critical variance issue due to the huge item space. To tackle this problem, previous studies introduce some assumptions on user behavior to make the combinatorial item space tractable. However, an unrealistic assumption may, in turn, cause serious bias. Therefore, appropriately controlling the bias-variance tradeoff by imposing a reasonable assumption is the key for success in OPE of ranking policies. To achieve a well-balanced bias-variance tradeoff, we propose…
