Energy-Based Hindsight Experience Prioritization
Rui Zhao, Volker Tresp

TL;DR
This paper introduces an energy-based method for prioritizing hindsight experience in reinforcement learning for robotic manipulation. By preferentially replaying episodes with high trajectory energy, it reports better sample efficiency and performance than existing methods.
Contribution
The paper proposes an energy-based trajectory prioritization framework for Hindsight Experience Replay (HER), inspired by the work-energy principle in physics, in which episodes are ranked by the trajectory energy of the target object.
Findings
Outperforms state-of-the-art methods in four robotic tasks
Improves sample efficiency without extra computational cost
Supports the hypothesis that replaying episodes with high trajectory energy is more effective for reinforcement learning in robotics
Abstract
In Hindsight Experience Replay (HER), a reinforcement learning agent is trained by treating whatever it has achieved as virtual goals. However, in previous work, the experience was replayed at random, without considering which episode might be the most valuable for learning. In this paper, we develop an energy-based framework for prioritizing hindsight experience in robotic manipulation tasks. Our approach is inspired by the work-energy principle in physics. We define a trajectory energy function as the sum of the transition energy of the target object over the trajectory. We hypothesize that replaying episodes that have high trajectory energy is more effective for reinforcement learning in robotics. To verify our hypothesis, we designed a framework for hindsight experience prioritization based on the trajectory energy of goal states. The trajectory energy function takes the potential,…
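The prioritization idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `trajectory_energy` and `episode_probabilities`, the parameters `mass`, `dt`, `g`, and `e_max`, the finite-difference velocity estimate, the clipping of negative energy changes to zero, and the choice to sample episodes in proportion to their energy are all assumptions made for illustration. Only potential and kinetic energy are modeled here.

```python
import numpy as np


def trajectory_energy(positions, mass=1.0, dt=0.04, g=9.81, e_max=None):
    """Sum the energy increases of the target object over one episode.

    positions: array of shape (T, 3) with the object's xyz position per
    timestep, where the third column is height. All physical constants
    are illustrative assumptions.
    """
    positions = np.asarray(positions, dtype=float)
    # Potential energy m*g*z at every timestep.
    potential = mass * g * positions[:, 2]
    # Kinetic energy from a finite-difference velocity estimate;
    # the object is assumed to be at rest at the first timestep.
    velocity = np.diff(positions, axis=0) / dt
    kinetic = np.concatenate([[0.0], 0.5 * mass * np.sum(velocity ** 2, axis=1)])
    energy = potential + kinetic
    # Transition energy: increase in total energy between consecutive
    # steps, clipped to [0, e_max] (the clipping is an assumption).
    transition = np.clip(np.diff(energy), 0.0, e_max)
    return float(transition.sum())


def episode_probabilities(energies):
    """Turn trajectory energies into replay sampling probabilities."""
    energies = np.asarray(energies, dtype=float)
    total = energies.sum()
    if total <= 0:
        # Fall back to uniform replay when no episode moved the object.
        return np.full(len(energies), 1.0 / len(energies))
    return energies / total


# Two toy episodes: the object never moves, or it is lifted by 0.2 m.
still = np.zeros((5, 3))
lifted = np.zeros((5, 3))
lifted[:, 2] = np.linspace(0.0, 0.2, 5)

energies = [trajectory_energy(still), trajectory_energy(lifted)]
probs = episode_probabilities(energies)

# Draw episode indices for replay according to the priorities.
rng = np.random.default_rng(0)
sampled = rng.choice(len(probs), size=4, p=probs)
```

In this toy setting the episode in which the object is lifted receives all of the sampling mass, because the stationary episode has zero trajectory energy. A practical replay buffer would likely mix this with some uniform sampling so that low-energy episodes are still visited.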
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Generative Adversarial Networks and Image Synthesis
Methods: Experience Replay
