H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps
Haoyi Niu, Tianying Ji, Bingqi Liu, Haocheng Zhao, Xiangyu Zhu,, Jianying Zheng, Pengfei Huang, Guyue Zhou, Jianming Hu, Xianyuan Zhan

TL;DR
H2O+ is a new hybrid RL framework that effectively combines offline data and imperfect simulations to improve real-world task performance despite dynamics gaps.
Contribution
The paper introduces H2O+, a flexible algorithm that bridges offline and online RL methods while addressing dynamics gaps between simulation and reality.
Findings
H2O+ outperforms existing cross-domain RL algorithms in simulations.
H2O+ demonstrates superior real-world robotics performance.
The framework offers high flexibility in combining offline and online learning.
Abstract
Solving real-world complex tasks using reinforcement learning (RL) without high-fidelity simulation environments or large amounts of offline data can be quite challenging. Online RL agents trained in imperfect simulation environments can suffer from severe sim-to-real issues. Offline RL approaches although bypass the need for simulators, often pose demanding requirements on the size and quality of the offline datasets. The recently emerged hybrid offline-and-online RL provides an attractive framework that enables joint use of limited offline data and imperfect simulator for transferable policy learning. In this paper, we develop a new algorithm, called H2O+, which offers great flexibility to bridge various choices of offline and online learning methods, while also accounting for dynamics gaps between the real and simulation environment. Through extensive simulation and real-world…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsReinforcement Learning in Robotics · Simulation Techniques and Applications
