Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao, Xudong Sun, Volker Tresp

TL;DR
This paper introduces a maximum entropy-regularized approach for multi-goal reinforcement learning, promoting goal diversity and improving performance and sample efficiency on robotic tasks.
Contribution
It proposes a novel weighted entropy objective and a maximum entropy prioritization framework, enhancing multi-goal RL by encouraging diverse goal achievement.
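For readers unfamiliar with the term, the general definition of weighted entropy is given below as background. It is the standard textbook form, not a formula quoted from the paper. The symbols $x$, $p(x)$ and $w(x)$ are generic placeholders, and the interpretation in the final sentence is an assumption inferred from the abstract.

```latex
H^{w}(X) \;=\; -\sum_{x} w(x)\, p(x) \log p(x)
        \;=\; \mathbb{E}_{x \sim p}\!\left[\, w(x)\,\bigl(-\log p(x)\bigr) \right],
\qquad w(x) \ge 0 .
```

With $w(x) = 1$ this reduces to Shannon entropy. If $x$ is read as the goals achieved in a trajectory and $w(x)$ as that trajectory's return (an assumption here), the quantity is the expected return weighted by the surprisal $-\log p(x)$, which is consistent with the abstract's description of an objective that rewards both return and goal diversity.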
Findings
Improved performance on multi-goal robotic tasks.
Enhanced sample efficiency with the proposed method.
Outperforms baseline algorithms in experiments.
Abstract
In Multi-Goal Reinforcement Learning, an agent learns to achieve multiple goals with a goal-conditioned policy. During learning, the agent first collects the trajectories into a replay buffer, and later these trajectories are selected randomly for replay. However, the achieved goals in the replay buffer are often biased towards the behavior policies. From a Bayesian perspective, when there is no prior knowledge about the target goal distribution, the agent should learn uniformly from diverse achieved goals. Therefore, we first propose a novel multi-goal RL objective based on weighted entropy. This objective encourages the agent to maximize the expected return, as well as to achieve more diverse goals. Secondly, we developed a maximum entropy-based prioritization framework to optimize the proposed objective. For evaluation of this framework, we combine it with Deep Deterministic Policy…
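The replay idea described in the abstract can be illustrated with a minimal sketch: estimate how often each achieved goal appears in the buffer, then sample trajectories with probability inversely related to that density so that rare goals are replayed more often. This is not the authors' implementation. The function name `max_entropy_priorities`, the histogram density estimate, and the inverse-density weighting are all assumptions chosen for brevity.

```python
import numpy as np


def max_entropy_priorities(achieved_goals, bins=10):
    """Return replay probabilities that favor rarely achieved goals.

    Hypothetical helper: a histogram stands in for whatever density
    model a real implementation would fit to the achieved goals.
    """
    goals = np.asarray(achieved_goals, dtype=float)
    if goals.ndim == 1:
        goals = goals[:, None]  # treat a flat list as 1-D goals

    # Estimate the density of achieved goals with an N-D histogram.
    hist, edges = np.histogramdd(goals, bins=bins)
    density = hist / hist.sum()

    # Look up the bin of each goal; interior edges map to indices 0..bins-1.
    idx = tuple(
        np.digitize(goals[:, d], edges[d][1:-1]) for d in range(goals.shape[1])
    )
    p = density[idx]

    # Guard against a zero density caused by floating-point edge effects.
    p = np.clip(p, 1.0 / len(goals), None)

    # Assumption: weight each trajectory by the inverse of its goal density.
    weights = 1.0 / p
    return weights / weights.sum()
```

A replay batch could then be drawn with `np.random.default_rng().choice(len(goals), size=batch_size, p=probs)`. Under this weighting every occupied histogram bin receives the same total sampling mass, which flattens the distribution of replayed goals compared with uniform sampling over trajectories.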
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques
Methods: Experience Replay
