Enabling Option Learning in Sparse Rewards with Hindsight Experience Replay

Gabriel Romio; Mateus Begnini Melchiades; Bruno Castro da Silva; Gabriel de Oliveira Ramos

arXiv:2602.13865 · cs.AI · February 17, 2026

TL;DR

This paper introduces MOC-2HER, which combines hierarchical reinforcement learning with dual goal relabeling and reaches success rates of up to 90% on sparse-reward robotic manipulation tasks.

Contribution

It proposes a dual-objective extension of HER that improves hierarchical RL in multi-goal, sparse-reward environments, particularly for object manipulation.

Findings

01 MOC-2HER achieves success rates of up to 90% on robotic manipulation tasks.

02 Standard MOC and MOC-HER both achieve success rates below 11%.

03 Dual goal relabeling improves learning efficiency in sparse-reward settings.

Abstract

Hierarchical Reinforcement Learning (HRL) frameworks like Option-Critic (OC) and Multi-updates Option Critic (MOC) have introduced significant advancements in learning reusable options. However, these methods underperform in multi-goal environments with sparse rewards, where actions must be linked to temporally distant outcomes. To address this limitation, we first propose MOC-HER, which integrates the Hindsight Experience Replay (HER) mechanism into the MOC framework. By relabeling goals from achieved outcomes, MOC-HER can solve sparse reward environments that are intractable for the original MOC. However, this approach is insufficient for object manipulation tasks, where the reward depends on the object reaching the goal rather than on the agent's direct interaction. This makes it extremely difficult for HRL agents to discover how to interact with these objects. To overcome this…
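The relabeling idea mentioned in the abstract can be illustrated with a minimal sketch of standard hindsight relabeling. It is not the authors' implementation and does not cover MOC, options, or the dual-objective variant. The `Transition` fields, the `sparse_reward` convention (0 on success, -1 otherwise), the tolerance `tol`, and the choice of the final achieved outcome as the substitute goal are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Goal = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """Hypothetical goal-conditioned transition record (illustrative only)."""
    state: Tuple[float, ...]
    action: int
    achieved_goal: Goal  # outcome actually reached after the action
    desired_goal: Goal   # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved: Goal, desired: Goal, tol: float = 0.05) -> float:
    """Assumed convention: 0.0 when within tol of the goal, -1.0 otherwise."""
    dist = sum((a - d) ** 2 for a, d in zip(achieved, desired)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode whose goal is the final achieved outcome.

    Rewards are recomputed against the substituted goal, so an episode that
    failed on its original goal still yields at least one successful step.
    """
    hindsight_goal = episode[-1].achieved_goal
    return [
        replace(
            t,
            desired_goal=hindsight_goal,
            reward=sparse_reward(t.achieved_goal, hindsight_goal),
        )
        for t in episode
    ]
```

In a replay-based learner, the relabeled transitions would typically be stored alongside the originals, so that failed episodes still contribute a non-trivial reward signal.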

Taxonomy

Topics: Reinforcement Learning in Robotics · Social Robot Interaction and HRI · Domain Adaptation and Few-Shot Learning