Dense and Diverse Goal Coverage in Multi Goal Reinforcement Learning
Sagalpreet Singh, Rishi Saket, Aravindan Raghuveer

TL;DR
This paper introduces a novel Multi Goal Reinforcement Learning framework that promotes policies achieving high returns while uniformly covering goal states, addressing exploration and distribution dispersion issues.
Contribution
It formalizes the Multi Goal RL problem and proposes an algorithm that learns a policy mixture with dispersed goal state coverage, backed by theoretical guarantees.
Findings
The algorithm effectively disperses goal state coverage in synthetic and standard environments.
Performance guarantees provide convergence bounds for optimizing both the return and the dispersion of the goal state distribution.
Experiments demonstrate improved goal coverage without sacrificing expected return.
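To make "dispersed goal state coverage" concrete, here is a minimal sketch of one way to quantify it: the normalized Shannon entropy of the empirical distribution over which goal each episode ended in. This is an illustrative assumption, not the metric or algorithm used in the paper. The function name `goal_coverage_entropy` and its arguments are hypothetical.

```python
import math
from collections import Counter


def goal_coverage_entropy(goal_visits, num_goals):
    """Normalized Shannon entropy of an empirical goal-visit distribution.

    goal_visits: iterable of goal identifiers, one per successful episode
                 (hypothetical input format, not taken from the paper).
    num_goals:   total number of distinct goal states, assumed known here.

    Returns a value in [0, 1]: 0 when every episode reaches the same goal,
    1 when all goals are reached equally often.
    """
    if num_goals < 2:
        raise ValueError("num_goals must be at least 2 to normalize entropy")

    counts = Counter(goal_visits)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no goal was reached, so there is no coverage to measure

    # Shannon entropy of the empirical distribution (goals never visited
    # contribute nothing to the sum).
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())

    # Divide by the entropy of the uniform distribution over all goals.
    return entropy / math.log(num_goals)


if __name__ == "__main__":
    # A policy that exploits a single goal versus one that spreads visits.
    print(goal_coverage_entropy([0, 0, 0, 0], num_goals=4))
    print(goal_coverage_entropy([0, 1, 2, 3], num_goals=4))
```

A score like this only describes where successful episodes end. It says nothing about expected return, so it would need to be reported alongside return to reflect the trade-off described above.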
Abstract
Reinforcement Learning algorithms are primarily focused on learning a policy that maximizes expected return. As a result, the learned policy can exploit one or a few reward sources. However, in many natural situations, it is desirable to learn a policy that induces a dispersed marginal state distribution over rewarding states while maximizing the expected return, which is typically tied to reaching a goal state. This aspect remains relatively unexplored. Existing techniques based on entropy regularization and intrinsic rewards use stochasticity to encourage exploration in search of an optimal policy, which may not necessarily lead to a dispersed marginal state distribution over rewarding states. Other RL algorithms that match a target distribution assume the latter to be available a priori. This may be infeasible in large-scale systems where enumeration of all states is not possible and a…
