TL;DR
This paper introduces IEOE, an experimental procedure for evaluating how robust Off-Policy Evaluation (OPE) estimators are to changes in hyperparameters and evaluation policies, and demonstrates it on real-world datasets.
Contribution
The paper develops IEOE, an interpretable experimental protocol for assessing the robustness of OPE estimators, addressing a key challenge in estimator selection and tuning.
Findings
IEOE evaluates how robust an OPE estimator is to its hyperparameter choices and to changes in the evaluation policy.
It helps practitioners identify safe and reliable OPE estimators.
The procedure is demonstrated on real-world datasets.
Abstract
Off-policy Evaluation (OPE), or offline evaluation in general, evaluates the performance of hypothetical policies using only offline log data. It is particularly useful in applications where online interaction is high-stakes and expensive, such as precision medicine and recommender systems. Since many OPE estimators have been proposed and some of them have hyperparameters to be tuned, practitioners face an emerging challenge in selecting and tuning OPE estimators for their specific application. Unfortunately, identifying a reliable estimator from results reported in research papers is often difficult because the current experimental procedure evaluates and compares the estimators' performance on a narrow set of hyperparameters and evaluation policies. Therefore, it is difficult to know which estimator is safe and reliable to use. In this work, we develop…
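To make the problem concrete, here is a minimal sketch of what checking an estimator across many hyperparameter values and evaluation policies could look like. It is not the paper's IEOE protocol, which the truncated abstract does not spell out. Everything here is an assumption chosen for illustration: a synthetic context-free bandit, a clipped inverse propensity score estimator, and the hypothetical names `clipped_ips` and `robustness_errors`.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def clipped_ips(rewards, pi_e_probs, pi_b_probs, clip):
    """Clipped inverse propensity score estimate of a policy's value.

    `clip` is the estimator's hyperparameter: importance weights are capped at it.
    """
    weights = np.minimum(pi_e_probs / pi_b_probs, clip)
    return float(np.mean(weights * rewards))


def robustness_errors(n_trials=200, n_rounds=2000, n_actions=5, seed=0):
    """Squared errors of clipped IPS over random hyperparameters and policies."""
    rng = np.random.default_rng(seed)
    # Synthetic ground truth (assumption): expected reward of each action.
    q = rng.uniform(0.1, 0.9, size=n_actions)
    # Logging (behavior) policy that generated the offline data.
    pi_b = softmax(rng.normal(size=n_actions))

    errors = np.empty(n_trials)
    for t in range(n_trials):
        # Draw a hyperparameter and an evaluation policy at random,
        # rather than fixing a single hand-picked configuration.
        clip = 10 ** rng.uniform(0, 2)  # clipping threshold in [1, 100]
        pi_e = softmax(rng.normal(size=n_actions) * rng.uniform(0.5, 3.0))

        # Simulate logged bandit feedback under the behavior policy.
        actions = rng.choice(n_actions, size=n_rounds, p=pi_b)
        rewards = rng.binomial(1, q[actions])

        estimate = clipped_ips(rewards, pi_e[actions], pi_b[actions], clip)
        truth = float(pi_e @ q)  # known only because the data is synthetic
        errors[t] = (estimate - truth) ** 2
    return errors


if __name__ == "__main__":
    errors = robustness_errors()
    # Summarize the whole error distribution, not just one configuration.
    print(np.quantile(errors, [0.5, 0.9, 1.0]))
```

Looking at the spread of errors (for example the upper quantiles) rather than a single number is one way to see whether an estimator stays accurate when its configuration and the evaluation policy change.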
