Real-world Reinforcement Learning from Suboptimal Interventions
Yinuo Zhao, Huiqian Jin, Lechun Jiang, Xinyi Zhang, Kun Wu, Pei Ren, Zhiyuan Xu, Zhengping Che, Lei Sun, Dapeng Wu, Chi Harold Liu, Jian Tang

TL;DR
This paper introduces SiLRI, a novel constrained reinforcement learning algorithm that effectively leverages suboptimal human interventions to accelerate real-world robotic manipulation learning, outperforming existing methods.
Contribution
We propose a state-wise Lagrangian RL approach that incorporates human intervention uncertainty, enabling robots to learn efficiently from suboptimal human guidance in real-world tasks.
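As a generic illustration of what a state-wise Lagrangian constraint can look like (this is not the paper's formulation, which this summary does not reproduce), the toy sketch below keeps one multiplier per state. All names here (`statewise_lagrangian_toy`, `a_star`, `a_human`, `eps`) are hypothetical. It assumes a tabular setting with a one-dimensional action, a known quadratic value peaked at `a_star`, and a per-state tolerance `eps` that stands in for how uncertain the human intervention is: a larger `eps` lets the policy move further from the human action.

```python
import numpy as np

def statewise_lagrangian_toy(a_star, a_human, eps, steps=2000, lr_a=0.05, lr_lam=0.05):
    """Primal-dual updates with one Lagrange multiplier per state (toy sketch).

    Per state s, maximise Q(s, a) = -(a - a_star[s])**2
    subject to (a - a_human[s])**2 <= eps[s].
    """
    a = np.asarray(a_human, dtype=float).copy()  # start from the human action
    lam = np.zeros_like(a)                       # one multiplier per state
    for _ in range(steps):
        # Constraint violation per state: positive means too far from the human action.
        violation = (a - a_human) ** 2 - eps
        # Gradient of the Lagrangian -Q + lam * violation with respect to the action.
        grad_a = 2.0 * (a - a_star) + 2.0 * lam * (a - a_human)
        a = a - lr_a * grad_a                             # primal descent
        lam = np.maximum(0.0, lam + lr_lam * violation)   # projected dual ascent
    return a, lam

# Two states: state 0 has a tight tolerance, state 1 a loose one (assumed values).
a_star = np.array([1.0, 1.0])    # assumed optimal actions
a_human = np.array([0.0, 0.0])   # assumed suboptimal human actions
eps = np.array([0.25, 4.0])      # assumed per-state tolerances
actions, multipliers = statewise_lagrangian_toy(a_star, a_human, eps)
print(actions, multipliers)
```

In this toy setting the tightly constrained state should settle near the constraint boundary with a positive multiplier, while the loosely constrained state should reach the unconstrained optimum with its multiplier staying at zero. That is the intuition behind letting the constraint strength vary by state rather than using a single global multiplier.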
Findings
SiLRI reduces training time by at least 50% compared to HIL-SERL.
SiLRI achieves a 100% success rate on long-horizon tasks.
SiLRI effectively exploits suboptimal human interventions in real-world experiments.
Abstract
Real-world reinforcement learning (RL) offers a promising approach to training precise and dexterous robotic manipulation policies in an online manner, enabling robots to learn from their own experience while gradually reducing human labor. However, prior real-world RL methods often assume that human interventions are optimal across the entire state space, overlooking the fact that even expert operators cannot consistently provide optimal actions in all states or completely avoid mistakes. Indiscriminately mixing intervention data with robot-collected data inherits the sample inefficiency of RL, while purely imitating intervention data can ultimately degrade the final performance achievable by RL. The question of how to leverage potentially suboptimal and noisy human interventions to accelerate learning without being constrained by them thus remains open. To address this challenge, we…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms
