Real-world Reinforcement Learning from Suboptimal Interventions

Yinuo Zhao; Huiqian Jin; Lechun Jiang; Xinyi Zhang; Kun Wu; Pei Ren; Zhiyuan Xu; Zhengping Che; Lei Sun; Dapeng Wu; Chi Harold Liu; Jian Tang

arXiv:2512.24288 · cs.RO · January 1, 2026

Open Access

TL;DR

This paper introduces SiLRI, a state-wise Lagrangian (constrained) reinforcement learning algorithm that leverages suboptimal human interventions to accelerate real-world robotic manipulation learning, cutting training time by at least 50% compared to HIL-SERL.

Contribution

We propose a state-wise Lagrangian RL approach that incorporates human intervention uncertainty, enabling robots to learn efficiently from suboptimal human guidance in real-world tasks.
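
To make "state-wise Lagrangian" concrete, here is a minimal generic sketch of the idea, not the paper's formulation (the abstract shown on this page is truncated before the method is described). It assumes a constraint of the form d(s) <= eps(s) for each state s, where d(s) is some distance between the policy's action and the human's action, and eps(s) is a tolerance that is assumed to grow where the human intervention is more uncertain. All names (`update_multipliers`, `lagrangian_loss`, `distances`, `tolerances`) are hypothetical.

```python
def update_multipliers(lambdas, distances, tolerances, lr=0.1):
    """One projected dual-ascent step with a separate multiplier per state.

    lambda(s) rises where the policy strays further from the human action
    than the tolerance eps(s) allows, and decays toward 0 otherwise.
    """
    return [
        max(0.0, lam + lr * (d - eps))  # projection keeps lambda(s) >= 0
        for lam, d, eps in zip(lambdas, distances, tolerances)
    ]


def lagrangian_loss(q_values, distances, tolerances, lambdas):
    """Batch-averaged policy loss: -Q(s, a) + lambda(s) * (d(s) - eps(s)).

    Assumption: q_values are critic estimates for the policy's actions.
    """
    terms = [
        -q + lam * (d - eps)
        for q, d, eps, lam in zip(q_values, distances, tolerances, lambdas)
    ]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    lambdas = [0.0, 0.5]
    distances = [0.4, 0.1]    # policy-vs-human action distance per state
    tolerances = [0.2, 0.3]   # looser where the human is assumed less reliable
    lambdas = update_multipliers(lambdas, distances, tolerances, lr=0.5)
    print(lambdas)
    print(lagrangian_loss([1.0, 2.0], distances, tolerances, lambdas))
```

The design point this illustrates is that a single global multiplier would weight the imitation constraint equally everywhere, whereas a per-state multiplier lets the constraint bind only in states where following the human is actually warranted.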

Findings

01. SiLRI reduces training time by at least 50% compared to HIL-SERL.

02. SiLRI achieves a 100% success rate on long-horizon tasks.

03. SiLRI effectively exploits suboptimal human interventions in real-world experiments.

Abstract

Real-world reinforcement learning (RL) offers a promising approach to training precise and dexterous robotic manipulation policies in an online manner, enabling robots to learn from their own experience while gradually reducing human labor. However, prior real-world RL methods often assume that human interventions are optimal across the entire state space, overlooking the fact that even expert operators cannot consistently provide optimal actions in all states or completely avoid mistakes. Indiscriminately mixing intervention data with robot-collected data inherits the sample inefficiency of RL, while purely imitating intervention data can ultimately degrade the final performance achievable by RL. The question of how to leverage potentially suboptimal and noisy human interventions to accelerate learning without being constrained by them thus remains open. To address this challenge, we…

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms