Real World Offline Reinforcement Learning with Realistic Data Source
Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar

TL;DR
This paper demonstrates that offline reinforcement learning can effectively leverage diverse real-world robot data to outperform imitation learning, emphasizing the importance of realistic datasets for practical robot learning.
Contribution
It provides the first extensive empirical study on real-world offline RL with a large, diverse dataset collected from actual robot operations, highlighting differences from imitation learning.
Findings
ORL outperforms imitation learning on real-world tasks.
Different action spaces are preferred by ORL and imitation learning.
Heterogeneous offline data enables better generalization in ORL.
Abstract
Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience. However, current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories, and thus hold limited relevance for real-world robotics. In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning. Under these settings, we perform an extensive (6500+ trajectories collected over 800+ robot hours and 270+ human labor hours) empirical study evaluating generalization and transfer capabilities of representative ORL methods on four real-world tabletop manipulation tasks. Our study finds that ORL and imitation learning prefer different action spaces, and that ORL…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Mobile Crowdsensing and Crowdsourcing
