DisTop: Discovering a Topological representation to learn diverse and rewarding skills
Arthur Aubret, Laetitia Matignon, Salima Hassas

TL;DR
DisTop introduces a novel topological approach for deep reinforcement learning that enables the discovery of diverse, rewarding skills through an unsupervised, hierarchical, and environment-agnostic framework, improving exploration especially in sparse reward settings.
Contribution
DisTop presents a new method combining topological environment modeling with hierarchical skill discovery, advancing exploration and performance in DRL.
Findings
DisTop is effective across high-dimensional data, images, and proprioceptive inputs.
It is competitive with state-of-the-art algorithms on MuJoCo benchmarks.
DisTop outperforms existing hierarchical RL methods in sparse reward scenarios.
Abstract
The optimal way for a deep reinforcement learning (DRL) agent to explore is to learn a set of skills that achieves a uniform distribution of states. Following this, we introduce DisTop, a new model that simultaneously learns diverse skills and focuses on improving rewarding skills. DisTop progressively builds a discrete topology of the environment using an unsupervised contrastive loss, a growing network and a goal-conditioned policy. Using this topology, a state-independent hierarchical policy can select where the agent has to keep discovering skills in the state space. In turn, the newly visited states allow the learnt representation to improve, and the learning loop continues. Our experiments emphasize that DisTop is agnostic to the ground state representation and that the agent can discover the topology of its environment whether the states are high-dimensional binary data, images, or…
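To make the idea of a "growing network" that discretizes the state space more concrete, here is a minimal sketch in plain Python. It is not DisTop's implementation: the class name `GrowingTopology`, the distance threshold `new_node_dist`, the update rate `lr`, and the inverse-count sampling rule are all assumptions chosen for illustration. The sketch assumes states have already been mapped to an embedding (in DisTop, a representation learned with a contrastive loss), adds a node whenever an embedding is far from every existing node, and samples rarely visited nodes more often to push coverage toward uniform.

```python
import math
import random


class GrowingTopology:
    """Toy growing network: nodes are prototypes in an embedding space.

    Hypothetical illustration only; the thresholds and update rules here
    are assumptions, not the ones used by DisTop.
    """

    def __init__(self, new_node_dist=1.0, lr=0.1):
        self.new_node_dist = new_node_dist  # create a node beyond this distance
        self.lr = lr                        # how fast a node follows its samples
        self.nodes = []                     # prototype vectors (lists of floats)
        self.counts = []                    # number of visits per node

    def nearest(self, z):
        """Return (index, distance) of the closest node, or (None, inf)."""
        best, best_d = None, math.inf
        for i, node in enumerate(self.nodes):
            d = math.dist(node, z)
            if d < best_d:
                best, best_d = i, d
        return best, best_d

    def update(self, z):
        """Assign embedding z to a node, growing the network if needed."""
        i, d = self.nearest(z)
        if i is None or d > self.new_node_dist:
            # z is far from everything seen so far: add a new node.
            self.nodes.append(list(z))
            self.counts.append(1)
            return len(self.nodes) - 1
        # Otherwise move the nearest prototype slightly toward z.
        node = self.nodes[i]
        for k in range(len(node)):
            node[k] += self.lr * (z[k] - node[k])
        self.counts[i] += 1
        return i

    def sample_goal_node(self, rng=random):
        """Pick a node with probability proportional to 1 / visit count."""
        weights = [1.0 / c for c in self.counts]
        return rng.choices(range(len(self.nodes)), weights=weights, k=1)[0]


# Usage: two nearby embeddings share a node, a distant one creates a new node.
topo = GrowingTopology(new_node_dist=1.0)
for z in [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]:
    topo.update(z)
goal = topo.sample_goal_node(random.Random(0))
```

In this toy version a higher-level policy would simply be `sample_goal_node`, and the sampled node's prototype would serve as the goal handed to a goal-conditioned policy. How DisTop actually weights nodes, including its focus on rewarding skills, is not specified in the text above.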
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks
