Cross-Validated Off-Policy Evaluation

Matej Cief; Branislav Kveton; Michal Kompan

arXiv:2405.15332 · cs.LG · December 23, 2024

TL;DR

This paper demonstrates how cross-validation can be effectively used for off-policy evaluation, providing practical guidance and empirical evidence to improve estimator selection and hyper-parameter tuning in this context.

Contribution

The paper introduces an approach to applying cross-validation in off-policy evaluation, challenging the belief that this is infeasible and giving practitioners a practical tool for estimator selection and hyper-parameter tuning.

Findings

01. Cross-validation improves estimator selection in off-policy evaluation.
02. The proposed method addresses a variety of use cases.
03. Empirical results validate the effectiveness of the approach.

Abstract

We study estimator selection and hyper-parameter tuning in off-policy evaluation. Although cross-validation is the most popular method for model selection in supervised learning, off-policy evaluation relies mostly on theory, which provides only limited guidance to practitioners. We show how to use cross-validation for off-policy evaluation. This challenges a popular belief that cross-validation in off-policy evaluation is not feasible. We evaluate our method empirically and show that it addresses a variety of use cases.
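The abstract does not spell out the procedure, so the following is only a minimal sketch of one generic way cross-validation could be applied to estimator selection, not the authors' algorithm. It assumes a context-free bandit with known logging propensities, repeatedly splits the logged data in half, computes each candidate estimator on one half, and scores it by squared distance to an unbiased inverse propensity scoring (IPS) estimate on the other half. All names (`ips`, `direct_method`, `cv_select`) and the synthetic data are hypothetical and chosen for illustration.

```python
import numpy as np


def ips(actions, rewards, pi_target, pi_logging, clip=np.inf):
    """Inverse propensity scoring; `clip` caps the importance weights."""
    weights = np.minimum(pi_target[actions] / pi_logging[actions], clip)
    return float(np.mean(weights * rewards))


def direct_method(actions, rewards, pi_target, pi_logging):
    """Direct method: per-action mean reward, averaged under the target policy."""
    n_actions = len(pi_target)
    sums = np.bincount(actions, weights=rewards, minlength=n_actions)
    counts = np.bincount(actions, minlength=n_actions)
    means = np.divide(sums, counts, out=np.zeros(n_actions), where=counts > 0)
    return float(pi_target @ means)


def cv_select(actions, rewards, pi_target, pi_logging, candidates,
              n_splits=20, seed=0):
    """Pick the candidate whose estimate on one half of the data is closest,
    in mean squared error, to an unclipped IPS estimate on the other half."""
    rng = np.random.default_rng(seed)
    n = len(actions)
    errors = {name: [] for name in candidates}
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, valid = perm[: n // 2], perm[n // 2:]
        # Unbiased but noisy reference value from the held-out half.
        reference = ips(actions[valid], rewards[valid], pi_target, pi_logging)
        for name, estimator in candidates.items():
            value = estimator(actions[train], rewards[train],
                              pi_target, pi_logging)
            errors[name].append((value - reference) ** 2)
    scores = {name: float(np.mean(errs)) for name, errs in errors.items()}
    return min(scores, key=scores.get), scores


# Synthetic logged data (assumed for illustration): 3 actions, Bernoulli rewards.
data_rng = np.random.default_rng(42)
pi_logging = np.array([0.7, 0.2, 0.1])
pi_target = np.array([0.1, 0.2, 0.7])
mean_reward = np.array([0.2, 0.5, 0.8])
actions = data_rng.choice(3, size=5000, p=pi_logging)
rewards = data_rng.binomial(1, mean_reward[actions]).astype(float)

candidates = {
    "ips": ips,
    "ips_clip_2": lambda a, r, pt, pl: ips(a, r, pt, pl, clip=2.0),
    "dm": direct_method,
}
best, scores = cv_select(actions, rewards, pi_target, pi_logging, candidates)
print(best, scores)
```

In this toy setup the heavily clipped estimator is strongly biased, so its validation error should be far larger than that of the other two candidates. The held-out IPS reference is itself noisy, which is why the sketch averages over several random splits rather than relying on a single one.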

Code & Models

Repositories

navarog/cross-validated-ope (PyTorch, official)

Videos

Cross-Validated Off-Policy Evaluation · Underline

Taxonomy

Topics: Evaluation and Performance Assessment