Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

Rui Zhao; Xudong Sun; Volker Tresp

arXiv:1905.08786·cs.LG·May 26, 2020·47 cites

TL;DR

This paper introduces a maximum entropy-regularized approach for multi-goal reinforcement learning, promoting goal diversity and improving performance and sample efficiency on robotic tasks.

Contribution

It proposes a novel weighted entropy objective and a maximum entropy prioritization framework, enhancing multi-goal RL by encouraging diverse goal achievement.
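
For reference, weighted entropy has a standard general form, stated in the first line below. The second line is only a sketch of how it might be instantiated for goal trajectories. The symbols `\tau^g`, `R`, `p_\theta`, and `\eta` are illustrative assumptions, and this summary does not say how the paper chooses the weights.

```latex
% Weighted entropy of a discrete distribution p with non-negative weights w_k
H^{w}(p) = -\sum_{k=1}^{K} w_k \, p_k \log p_k

% Assumed instantiation: weight each goal trajectory \tau^g by its return R(\tau^g)
\eta(\theta) = \mathbb{E}_{p_\theta}\!\left[ R(\tau^g) \, \log \frac{1}{p_\theta(\tau^g)} \right]
```

Under this reading, a trajectory contributes more when it both earns a high return and reaches a goal that is rarely achieved, which matches the stated aim of maximizing return while diversifying achieved goals.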

Findings

1. Improved performance on multi-goal robotic tasks.
2. Enhanced sample efficiency with the proposed method.
3. Outperforms baseline algorithms in experiments.

Abstract

In Multi-Goal Reinforcement Learning, an agent learns to achieve multiple goals with a goal-conditioned policy. During learning, the agent first collects trajectories into a replay buffer, and these trajectories are later selected at random for replay. However, the achieved goals in the replay buffer are often biased towards the behavior policies. From a Bayesian perspective, when there is no prior knowledge about the target goal distribution, the agent should learn uniformly from diverse achieved goals. Therefore, we first propose a novel multi-goal RL objective based on weighted entropy. This objective encourages the agent to maximize the expected return, as well as to achieve more diverse goals. Second, we develop a maximum entropy-based prioritization framework to optimize the proposed objective. For evaluation of this framework, we combine it with Deep Deterministic Policy…
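
The replay-bias problem described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the count-based density estimate, the inverse-density weighting, and the toy buffer of discretized goal bins are all assumptions made for illustration. It shows only the general idea that replaying rarely achieved goals more often raises the entropy of the replayed goal distribution.

```python
import math
import random
from collections import Counter


def entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def priorities(goals):
    """Sampling probability per stored goal, proportional to 1 / empirical density.

    Assumption: goals are hashable (e.g. discretized bins), so a simple count
    serves as the density estimate.
    """
    density = Counter(goals)
    n = len(goals)
    raw = [n / density[g] for g in goals]  # rarer goal => larger weight
    z = sum(raw)
    return [w / z for w in raw]


rng = random.Random(0)

# Hypothetical replay buffer: achieved goals heavily skewed towards bin 0,
# mimicking a bias towards what the behavior policy reaches most often.
buffer_goals = [0] * 80 + [1] * 15 + [2] * 4 + [3] * 1

probs = priorities(buffer_goals)

# Uniform replay versus prioritized replay over the same buffer.
uniform_batch = rng.choices(buffer_goals, k=2000)
prioritized_batch = rng.choices(buffer_goals, weights=probs, k=2000)

h_uniform = entropy(Counter(uniform_batch))
h_prioritized = entropy(Counter(prioritized_batch))

print(f"entropy of uniformly replayed goals:   {h_uniform:.3f} nats")
print(f"entropy of prioritized replayed goals: {h_prioritized:.3f} nats")
```

With inverse-density weights, every goal bin receives the same total sampling mass, so the prioritized batch approaches the maximum entropy of log(4) nats for four bins, while the uniform batch inherits the skew of the buffer. A practical system would need a density model suited to continuous goals, which this sketch does not attempt.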

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques

Methods: Experience Replay