Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation
Todor Davchev, Oleg Sushkov, Jean-Baptiste Regli, Stefan Schaal, Yusuf Aytar, Markus Wulfmeier, Jon Scholz

TL;DR
This paper introduces a goal selection method that leverages demonstration-guided hindsight relabelling to improve sample efficiency and success rates in long-horizon, complex robotic manipulation tasks with sparse rewards.
Contribution
It extends hindsight relabelling to incorporate task-specific goal distributions from demonstrations, enhancing exploration and performance in complex manipulation tasks.
Findings
Requires fewer demonstrations to solve tasks.
Achieves higher success rates as task complexity increases.
Demonstrates robustness to input representation quality.
Abstract
Complex sequential tasks in continuous-control settings often require agents to successfully traverse a set of "narrow passages" in their state space. Solving such tasks with a sparse reward in a sample-efficient manner poses a challenge to modern reinforcement learning (RL) due to the associated long-horizon nature of the problem and the lack of sufficient positive signal during learning. Various tools have been applied to address this challenge. When available, large sets of demonstrations can guide agent exploration. Hindsight relabelling on the other hand does not require additional sources of information. However, existing strategies explore based on task-agnostic goal distributions, which can render the solution of long-horizon tasks impractical. In this work, we extend hindsight relabelling mechanisms to guide exploration along task-specific distributions implied by a small set…
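To make the contrast in the abstract concrete, here is a minimal sketch of hindsight relabelling with two goal sources: a task-agnostic one (goals drawn from states reached later in the same episode) and a demonstration-guided one (goals drawn from states visited in a small set of demonstrations). This is an illustration under stated assumptions, not the authors' implementation. The names `relabel_episode`, `relabel_future`, `relabel_from_demos`, the mixing probability `p_demo`, the Euclidean distance threshold `tol`, and the simplified `(state, action, next_state)` transition format are all hypothetical choices made for this example.

```python
import math
import random
from typing import List, Sequence, Tuple

State = Tuple[float, ...]
Transition = Tuple[State, int, State]  # (state, action, next_state), simplified


def sparse_reward(achieved: State, goal: State, tol: float = 0.05) -> float:
    """Sparse reward: 1.0 only when the achieved state is within tol of the goal."""
    return 1.0 if math.dist(achieved, goal) <= tol else 0.0


def relabel_future(episode: Sequence[Transition], t: int, rng: random.Random) -> State:
    """Task-agnostic relabelling: pick a state reached at step t or later in this episode."""
    idx = rng.randrange(t, len(episode))
    return episode[idx][2]


def relabel_from_demos(demo_states: Sequence[State], rng: random.Random) -> State:
    """Demonstration-guided relabelling (assumed form): pick a state seen in a demonstration."""
    return rng.choice(demo_states)


def relabel_episode(
    episode: Sequence[Transition],
    demo_states: Sequence[State],
    p_demo: float,
    rng: random.Random,
    tol: float = 0.05,
) -> List[Tuple[State, int, State, State, float]]:
    """Return goal-conditioned transitions (s, a, s_next, goal, reward).

    With probability p_demo the goal comes from the demonstrations,
    otherwise from the agent's own future states. The mixing rule is
    an assumption of this sketch, not something stated in the paper summary.
    """
    relabelled = []
    for t, (s, a, s_next) in enumerate(episode):
        if demo_states and rng.random() < p_demo:
            goal = relabel_from_demos(demo_states, rng)
        else:
            goal = relabel_future(episode, t, rng)
        relabelled.append((s, a, s_next, goal, sparse_reward(s_next, goal, tol)))
    return relabelled


if __name__ == "__main__":
    rng = random.Random(0)
    # A short 1-D episode that never reaches the demonstrated region near 1.0.
    episode = [((0.0,), 0, (0.1,)), ((0.1,), 0, (0.2,)), ((0.2,), 0, (0.3,))]
    demo_states = [(0.5,), (0.9,), (1.0,)]
    for row in relabel_episode(episode, demo_states, p_demo=0.5, rng=rng):
        print(row)
```

In this toy setting, goals drawn from the episode itself only ever cover states the agent has already reached, whereas goals drawn from `demo_states` point towards regions the task actually requires. That is the intuition behind guiding exploration along a task-specific goal distribution.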
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Evolutionary Algorithms and Applications
