Hybrid Cross-domain Robust Reinforcement Learning
Linh Le Pham Van, Minh Hoang Nguyen, Hung Le, Hung The Tran, Sunil Gupta

TL;DR
HYDRO introduces a hybrid framework combining offline data and online simulation to enhance robust reinforcement learning, effectively addressing dynamics mismatch and improving sample efficiency in uncertain environments.
Contribution
This paper presents HYDRO, the first hybrid cross-domain robust RL framework that leverages both offline datasets and online simulators with novel uncertainty filtering.
Findings
HYDRO outperforms existing robust RL methods across multiple tasks.
HYDRO improves sample efficiency in offline robust RL scenarios.
HYDRO reduces the performance gap between the simulator and the real environment.
Abstract
Robust reinforcement learning (RL) aims to learn policies that remain effective despite uncertainties in their environment, which frequently arise in real-world applications due to variations in environment dynamics. Robust RL methods learn a robust policy by maximizing value under the worst-case models within a predefined uncertainty set. Offline robust RL algorithms are particularly promising in scenarios where only a fixed dataset is available and new data cannot be collected. However, these approaches often require extensive offline data, and gathering such datasets for specific tasks in specific environments can be both costly and time-consuming. Using an imperfect simulator offers a faster, cheaper, and safer way to collect data for training, but it can suffer from dynamics mismatch. In this paper, we introduce HYDRO, the first Hybrid Cross-Domain Robust RL framework designed to…
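The worst-case objective mentioned in the abstract can be illustrated with a small tabular sketch. This is not HYDRO's algorithm; it is generic robust value iteration under assumptions that do not come from the paper: a finite uncertainty set of two hand-written transition models, worst-case selection made independently per state-action pair, and hypothetical names (`robust_value_iteration`, `P_set`, `nominal`, `perturbed`).

```python
import numpy as np

def robust_value_iteration(P_set, R, gamma=0.9, iters=200):
    """Generic robust value iteration over a finite uncertainty set.

    P_set: array of shape (K, S, A, S) holding K candidate transition models.
    R:     array of shape (S, A) holding rewards.
    """
    num_states = P_set.shape[1]
    V = np.zeros(num_states)
    for _ in range(iters):
        # Q-values under every candidate model: shape (K, S, A).
        Q_all = R[None, :, :] + gamma * (P_set @ V)
        # Worst case over models for each (state, action) pair.
        Q_robust = Q_all.min(axis=0)
        # Greedy improvement against that worst case.
        V = Q_robust.max(axis=1)
    return V, Q_robust.argmax(axis=1)

# Illustrative two-state, two-action problem (all numbers are made up).
nominal = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
])
perturbed = np.array([
    [[0.6, 0.4], [0.5, 0.5]],
    [[0.8, 0.2], [0.4, 0.6]],
])
P_set = np.stack([nominal, perturbed])       # shape (2, 2, 2, 2)
R = np.array([[0.0, 0.0], [1.0, 1.0]])       # reward 1 only in state 1

V_robust, policy = robust_value_iteration(P_set, R)
# Planning against the nominal model alone, for comparison.
V_nominal, _ = robust_value_iteration(P_set[:1], R)
print("robust value: ", V_robust)
print("nominal value:", V_nominal)
```

The robust value is never larger than the value computed for any single model in the set, which is the price paid for guarding against dynamics mismatch.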
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning
