Data-Driven Off-Policy Estimator Selection: An Application in User Marketing on An Online Content Delivery Service
Yuta Saito, Takuma Udagawa, and Kei Tateno

TL;DR
This paper proposes a data-driven method for selecting the most appropriate off-policy evaluation estimator tailored to specific application settings, demonstrated through real-world online content delivery and marketing scenarios.
Contribution
It introduces a practical estimator selection procedure for off-policy evaluation, addressing the challenge of choosing suitable estimators for different real-world applications.
Findings
Estimator suitability varies with outcome definitions.
The selection procedure effectively identifies appropriate estimators.
Proper estimator selection improves evaluation accuracy in practice.
Abstract
Off-policy evaluation (OPE) is a method that attempts to estimate the performance of decision-making policies using historical data generated by different policies, without conducting costly online A/B tests. Accurate OPE is essential in domains such as healthcare, marketing, and recommender systems to avoid deploying poorly performing policies, as such policies may harm human lives or destroy the user experience. Thus, many OPE methods with theoretical backgrounds have been proposed. One emerging challenge with this trend is that the suitable estimator can differ for each application setting. Practitioners often do not know which estimator to use for their specific applications and purposes. To find a suitable estimator among many candidates, we use a data-driven estimator selection procedure for off-policy policy performance estimators as a practical solution. As proof of…
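To make the idea concrete, the following is a minimal sketch of OPE and of one plausible data-driven way to choose between estimators. It is not the authors' procedure: the context-free bandit setup, the two candidate estimators (IPS and self-normalized IPS), the bootstrap-based error measure, and every name in the code (`ips`, `snips`, `simulate_logs`, `select_estimator`) are assumptions made for illustration, and the data is synthetic.

```python
import numpy as np

def ips(pi_e, pi_b, actions, rewards):
    """Inverse propensity scoring estimate of the value of pi_e from pi_b's logs."""
    weights = pi_e[actions] / pi_b[actions]
    return float(np.mean(weights * rewards))

def snips(pi_e, pi_b, actions, rewards):
    """Self-normalized IPS: divides by the sum of weights instead of n."""
    weights = pi_e[actions] / pi_b[actions]
    return float(np.sum(weights * rewards) / np.sum(weights))

def simulate_logs(policy, true_means, n, rng):
    """Synthetic logged data: actions drawn from `policy`, Bernoulli rewards."""
    actions = rng.choice(len(policy), size=n, p=policy)
    rewards = rng.binomial(1, true_means[actions]).astype(float)
    return actions, rewards

def select_estimator(estimators, pi_e, pi_b, logs_b, on_policy_value, n_boot, rng):
    """Pick the estimator with the lowest bootstrap mean squared error
    against an on-policy estimate of pi_e's value (hypothetical procedure)."""
    actions, rewards = logs_b
    n = len(actions)
    mse = {}
    for name, estimator in estimators.items():
        squared_errors = []
        for _ in range(n_boot):
            idx = rng.integers(0, n, size=n)  # resample the logs with replacement
            estimate = estimator(pi_e, pi_b, actions[idx], rewards[idx])
            squared_errors.append((estimate - on_policy_value) ** 2)
        mse[name] = float(np.mean(squared_errors))
    best = min(mse, key=mse.get)
    return best, mse

rng = np.random.default_rng(0)
true_means = np.array([0.1, 0.3, 0.6])  # assumed expected reward of each action
pi_b = np.array([0.5, 0.3, 0.2])        # behavior policy that generated the logs
pi_e = np.array([0.2, 0.3, 0.5])        # policy whose value we want to estimate

logs_b = simulate_logs(pi_b, true_means, 5000, rng)
logs_e = simulate_logs(pi_e, true_means, 5000, rng)

# Ground truth proxy: average reward observed when pi_e itself collected data.
on_policy_value = float(np.mean(logs_e[1]))

estimators = {"IPS": ips, "SNIPS": snips}
best, mse = select_estimator(estimators, pi_e, pi_b, logs_b, on_policy_value, 50, rng)
print("selected estimator:", best)
print("bootstrap MSE per estimator:", mse)
```

The sketch relies on having logs from more than one policy, so that one policy's on-policy reward can stand in for the ground truth when scoring the candidates. Which estimator wins depends on the data and the outcome definition, which is the reason for selecting it empirically rather than fixing it in advance.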
Taxonomy
Topics: Smart Grid Energy Management · Advanced Causal Inference Techniques · Recommender Systems and Techniques
