TL;DR
This paper introduces DTGSH (diversity-based trajectory and goal selection with HER), which improves hindsight experience replay (HER) by selecting diverse trajectories and goals with determinantal point processes, leading to faster learning and higher performance in robotic manipulation tasks.
Contribution
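The paper proposes a diversity-based sampling method for HER: trajectories are sampled according to the diversity of their goal states, modelled with determinantal point processes (DPPs), and transitions with diverse goal states are then selected with k-DPPs. This improves learning speed and final performance in robotic manipulation.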
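For readers unfamiliar with DPPs, the standard L-ensemble definitions are reproduced below as background. The notation (L, Y, N, k) is chosen here for illustration, and this summary does not state which kernel the paper uses, so treat it as general context rather than the authors' exact formulation. Because the determinant of a kernel submatrix grows as the selected items become less similar to one another, both distributions favour diverse subsets.

```latex
% L: positive semidefinite kernel over N candidate items (e.g. goal states).
% L_Y: the submatrix of L indexed by the subset Y.
% DPP (L-ensemble): probability of drawing the subset Y.
\[
  P_L(Y) = \frac{\det(L_Y)}{\det(L + I)}, \qquad Y \subseteq \{1, \dots, N\}
\]
% k-DPP: the same model conditioned on subsets of exactly k items.
\[
  P_L^{k}(Y) = \frac{\det(L_Y)}{\sum_{|Y'| = k} \det(L_{Y'})}, \qquad |Y| = k
\]
```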
Findings
DTGSH outperforms state-of-the-art methods on all five tested robotic manipulation tasks.
The diversity-based sampling accelerates learning.
DTGSH reaches higher final performance.
Abstract
Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent's experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show…
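The two selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the RBF kernel, the `sigma` parameter, the function names, and the greedy subset search (a deterministic stand-in for drawing an exact k-DPP sample) are all assumptions made here for illustration.

```python
import numpy as np

def rbf_kernel(goals, sigma=1.0):
    """Similarity matrix over goal states (RBF kernel is an assumption)."""
    g = np.asarray(goals, dtype=float)
    sq_dist = np.sum((g[:, None, :] - g[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def diversity(goals, sigma=1.0):
    """Unnormalised DPP score: determinant of the kernel over the goal set."""
    return float(np.linalg.det(rbf_kernel(goals, sigma)))

def trajectory_probs(trajectory_goals, sigma=1.0):
    """Sampling probability per trajectory, proportional to goal diversity."""
    scores = np.array([diversity(g, sigma) for g in trajectory_goals])
    scores = np.clip(scores, 0.0, None)  # guard against tiny negative round-off
    total = scores.sum()
    if total <= 0.0:
        # All trajectories degenerate: fall back to uniform sampling.
        return np.full(len(scores), 1.0 / len(scores))
    return scores / total

def greedy_k_select(goals, k, sigma=1.0):
    """Greedily pick k indices that maximise the kernel sub-determinant.

    This approximates the most likely subset under a k-DPP; it does not
    draw a random k-DPP sample.
    """
    L = rbf_kernel(goals, sigma)
    chosen = []
    for _ in range(min(k, len(L))):
        best, best_det = None, -np.inf
        for i in range(len(L)):
            if i in chosen:
                continue
            idx = chosen + [i]
            det = np.linalg.det(L[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = i, det
        chosen.append(best)
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical replay buffer: 4 trajectories, 5 achieved goals each, in 3-D.
    buffer_goals = [rng.normal(scale=s, size=(5, 3)) for s in (0.01, 0.1, 0.5, 1.0)]
    probs = trajectory_probs(buffer_goals)
    traj = rng.choice(len(buffer_goals), p=probs)
    picked = greedy_k_select(buffer_goals[traj], k=2)
    print("trajectory probabilities:", probs)
    print("sampled trajectory:", traj, "selected transition indices:", picked)
```

In this sketch a trajectory whose goal states are spread out receives a larger determinant, and therefore a higher sampling probability, than one whose goal states cluster together. Within a trajectory, the greedy step avoids picking two near-identical goals.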
Taxonomy
Methods: Experience Replay
