Improving Offline Reinforcement Learning with Inaccurate Simulators
Yiwen Hou, Haoyuan Sun, Jinming Ma, Feng Wu

TL;DR
This paper introduces a method that combines offline datasets with data from inaccurate simulators using a GAN-based approach, improving offline RL performance in robotic tasks.
Contribution
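It proposes a GAN-based technique for combining data from an inaccurate simulator with an offline dataset: a GAN pre-trained on the state distribution of the offline dataset is used to reweight the simulated data before it is used for offline RL.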
Findings
Outperforms state-of-the-art methods on the D4RL benchmark
Effective in real-world manipulation tasks
Uses a GAN to reweight simulated data
Abstract
Offline reinforcement learning (RL) provides a promising approach to avoid costly online interaction with the real environment. However, the performance of offline RL highly depends on the quality of the datasets, which may cause extrapolation error in the learning process. In many robotic applications, an inaccurate simulator is often available. However, the data directly collected from the inaccurate simulator cannot be directly used in offline RL due to the well-known exploration-exploitation dilemma and the dynamic gap between inaccurate simulation and the real environment. To address these issues, we propose a novel approach to combine the offline dataset and the inaccurate simulation data in a better manner. Specifically, we pre-train a generative adversarial network (GAN) model to fit the state distribution of the offline dataset. Given this, we collect data from the inaccurate…
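The reweighting idea mentioned in the findings can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt above does not give their weighting rule, so the function names, the clipping threshold, and the normalization below are all assumptions. The sketch relies on a standard GAN fact (for an optimal discriminator trained between a data distribution and a generator distribution, D(s) / (1 - D(s)) equals the ratio of the two densities) and assumes, loosely, that the same score can serve as a rough indicator of how closely a simulated state resembles the offline data.

```python
def density_ratio_weights(disc_scores, clip=10.0, eps=1e-6):
    """Map discriminator outputs D(s) in (0, 1) to per-sample weights.

    Hypothetical helper: uses D / (1 - D) as a density-ratio estimate,
    clips large ratios, and rescales so the weights average to 1.
    """
    if not disc_scores:
        raise ValueError("disc_scores must be non-empty")
    raw = []
    for d in disc_scores:
        d = min(max(d, eps), 1.0 - eps)       # keep scores away from 0 and 1
        raw.append(min(d / (1.0 - d), clip))  # clipped ratio estimate
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_mean_loss(losses, weights):
    """Average per-sample losses, scaling each by its weight."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Three simulated states with made-up discriminator scores: the state the
# discriminator finds most similar to the offline data gets the largest weight.
scores = [0.9, 0.5, 0.1]
weights = density_ratio_weights(scores)
loss = weighted_mean_loss([1.0, 1.0, 1.0], weights)
```

Rescaling the weights to mean 1 keeps the overall loss scale comparable to an unweighted average, and clipping limits the influence of any single simulated sample; both are design choices of this sketch rather than details taken from the paper.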
Taxonomy
Topics: Reinforcement Learning in Robotics
