Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

Cameron Voloshin; Hoang M. Le; Nan Jiang; Yisong Yue

arXiv:1911.06854 · cs.LG · November 30, 2021 · 68 cites

TL;DR

This paper presents a comprehensive empirical benchmark for off-policy policy evaluation in reinforcement learning, emphasizing diverse experimental designs and providing practical guidelines for real-world applications.

Contribution

It introduces the Caltech OPE Benchmarking Suite (COBS), a standardized platform for stress testing OPE methods and analyzing their performance across various scenarios.

Findings

1. Diverse experimental setups reveal the strengths and weaknesses of different OPE methods.

2. The empirical results are distilled into guidelines that help practitioners select appropriate OPE techniques.

3. The open-source software facilitates reproducibility and further research in OPE evaluation.

Abstract

We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications. Given the increasing interest in deploying learning-based methods, there has been a flurry of recent proposals for OPE methods, leading to a need for standardized empirical analyses. Our work takes a strong focus on diversity of experimental design to enable stress testing of OPE methods. We provide a comprehensive benchmarking suite to study the interplay of different attributes on method performance. We distill the results into a summarized set of guidelines for OPE in practice. Our software package, the Caltech OPE Benchmarking Suite (COBS), is open-sourced and we invite interested researchers to further contribute to the benchmark.

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Reliability and Analysis Research · Formal Methods in Verification