MRHER: Model-based Relay Hindsight Experience Replay for Sequential Object Manipulation Tasks with Sparse Rewards
Yuming Huang, Bin Ren, Ziming Xu, Lianghong Wu

TL;DR
MRHER is a model-based RL framework that improves sample efficiency in sequential object manipulation tasks with sparse rewards by decomposing each task into subtasks and relabeling goals with a new model-based method called Foresight relabeling (FR).
Contribution
The paper introduces MRHER, which combines task decomposition with a robust model-based relabeling technique to improve learning efficiency in goal-conditioned RL.
Findings
Outperforms RHER by 13.79% in FetchPush-v1
Outperforms RHER by 14.29% in FetchPickAndPlace-v1
Achieves state-of-the-art sample efficiency in benchmark tasks
Abstract
Sparse rewards pose a significant challenge to achieving high sample efficiency in goal-conditioned reinforcement learning (RL). Specifically, in sequential manipulation tasks, the agent receives failure rewards until it successfully completes the entire manipulation task, which leads to low sample efficiency. To tackle this issue and improve sample efficiency, we propose a novel model-based RL framework called Model-based Relay Hindsight Experience Replay (MRHER). MRHER breaks down a continuous task into subtasks with increasing complexity and utilizes the previous subtask to guide the learning of the subsequent one. Instead of using Hindsight Experience Replay (HER) in every subtask, we design a new robust model-based relabeling method called Foresight relabeling (FR). FR predicts the future trajectory of the hindsight state and relabels the expected goal as a goal achieved on the…
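For context, the baseline that the abstract contrasts against, Hindsight Experience Replay, can be sketched in a few lines. The following is a minimal illustration of standard HER relabeling with the "future" strategy, not MRHER's Foresight relabeling, whose details are cut off above. The transition layout, the helper names `sparse_reward` and `her_relabel`, the 0.05 success tolerance, and the toy episode are all assumptions made for this example.

```python
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Assumed Fetch-style sparse reward: 0 on success, -1 otherwise."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(episode, k=4, rng=None):
    """Standard HER 'future' relabeling (hypothetical helper).

    For each transition, sample k goals that were actually achieved at the
    same or a later step of the episode, substitute each one for the desired
    goal, and recompute the sparse reward against it.
    """
    rng = rng or random.Random(0)
    relabeled = []
    last = len(episode) - 1
    for t, tr in enumerate(episode):
        for _ in range(k):
            future = rng.randint(t, last)  # inclusive on both ends
            new_goal = episode[future]["achieved_next"]
            reward = sparse_reward(tr["achieved_next"], new_goal)
            relabeled.append({**tr, "goal": new_goal, "reward": reward})
    return relabeled


# Toy episode: the object drifts along x but never reaches the desired goal,
# so every original transition carries the failure reward.
desired_goal = (1.0, 0.0)
episode = []
for t in range(5):
    achieved_next = (0.1 * (t + 1), 0.0)
    episode.append({
        "step": t,
        "achieved_next": achieved_next,
        "goal": desired_goal,
        "reward": sparse_reward(achieved_next, desired_goal),
    })

relabeled = her_relabel(episode, k=4)
successes = sum(1 for tr in relabeled if tr["reward"] == 0.0)
print(f"original successes: 0, relabeled successes: {successes}")
```

Every original transition in the toy episode is a failure, while relabeling against achieved goals yields some success rewards for the learner. Per the abstract, FR differs by relabeling with goals taken from a model-predicted future trajectory rather than only from the recorded episode.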
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function
Methods: Experience Replay
