Marginalized Importance Sampling for Off-Environment Policy Evaluation

Pulkit Katdare; Nan Jiang; Katherine Driggs-Campbell

arXiv:2309.01807·cs.LG·October 6, 2023



Open Access

TL;DR

This paper introduces a novel marginalized importance sampling method that combines simulation and offline data to accurately evaluate policies before real-world deployment, addressing key challenges in density ratio estimation.

Contribution

The paper proposes a two-step density ratio learning approach that uses the target policy's occupancy in the simulator, with the aim of improving efficiency and robustness in off-environment policy evaluation.

Findings

01. The method generalizes well across different Sim2Sim gaps.

02. It achieves accurate policy evaluation in various environments.

03. It demonstrates successful transfer to real-world validation on a robotic arm.

Abstract

Reinforcement Learning (RL) methods are typically sample-inefficient, making it challenging to train and deploy RL policies on real-world robots. Even a robust policy trained in simulation requires a real-world deployment to assess its performance. This paper proposes a new approach to evaluate the real-world performance of agent policies prior to deploying them in the real world. Our approach incorporates a simulator along with real-world offline data to evaluate the performance of any policy using the framework of Marginalized Importance Sampling (MIS). Existing MIS methods face two challenges: (1) large density ratios that deviate from a reasonable range and (2) indirect supervision, where the ratio needs to be inferred indirectly, thus exacerbating estimation error. Our approach addresses these challenges by introducing the target policy's occupancy in the simulator as an…
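To make the MIS idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy problem with three discrete (state, action) pairs whose distributions are known exactly. In practice these ratios must be learned. The names `d_data`, `d_sim`, `d_real`, and `mis_estimate` are hypothetical. The sketch reads the "two-step" idea as writing the real-to-data density ratio as a product of two ratios that pass through the simulator occupancy. This reading is an assumption based only on the summary above.

```python
import numpy as np

def mis_estimate(weights, rewards):
    """Average of density-ratio-weighted rewards over offline samples."""
    weights = np.asarray(weights, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    return float(np.mean(weights * rewards))

# Hypothetical toy distributions over 3 discrete (state, action) pairs.
d_data = np.array([0.5, 0.3, 0.2])    # offline real-world data distribution (assumed)
d_sim = np.array([0.2, 0.5, 0.3])     # target policy occupancy in the simulator (assumed)
d_real = np.array([0.25, 0.25, 0.5])  # target policy occupancy in the real world (assumed)
reward = np.array([1.0, 0.0, 2.0])    # reward of each (state, action) pair (assumed)

# Two ratios whose product equals the real-to-data ratio d_real / d_data.
w_sim_over_data = d_sim / d_data
w_real_over_sim = d_real / d_sim
w = w_real_over_sim * w_sim_over_data

# Draw offline samples from the data distribution and reweight their rewards.
rng = np.random.default_rng(0)
idx = rng.choice(3, size=50_000, p=d_data)
estimate = mis_estimate(w[idx], reward[idx])

# Exact expected reward under the real-world occupancy, for comparison.
true_value = float(np.dot(d_real, reward))
print(f"MIS estimate: {estimate:.3f}, exact value: {true_value:.3f}")
```

Here the estimate is the expected per-step reward under the normalized real-world occupancy. The rarely sampled third pair carries a weight of 2.5, which illustrates how ratios far from 1 inflate the variance of the estimate.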


Taxonomy

Topics: Reinforcement Learning in Robotics · Simulation Techniques and Applications · Ethics and Social Impacts of AI