A Unified Algorithm Framework for Unsupervised Discovery of Skills based on Determinantal Point Process
Jiayu Chen, Vaneet Aggarwal, Tian Lan

TL;DR
This paper introduces ODPP, a unified algorithm for unsupervised skill discovery in reinforcement learning. It uses a Determinantal Point Process (DPP) to balance the diversity and coverage of the learned options, and it is reported to outperform existing methods.
Contribution
The paper presents a novel framework unifying diversity and coverage in unsupervised skill discovery via DPP, with an effective algorithm called ODPP.
Findings
ODPP outperforms state-of-the-art baselines in MuJoCo and Atari tasks.
The method effectively balances diversity and coverage in skill discovery.
Extensive evaluations demonstrate the superiority of ODPP over existing approaches.
Abstract
Learning rich skills under the option framework without supervision of external rewards is at the frontier of reinforcement learning research. Existing works mainly fall into two distinctive categories: variational option discovery that maximizes the diversity of the options through a mutual information loss (while ignoring coverage) and Laplacian-based methods that focus on improving the coverage of options by increasing connectivity of the state space (while ignoring diversity). In this paper, we show that diversity and coverage in unsupervised option discovery can indeed be unified under the same mathematical framework. To be specific, we explicitly quantify the diversity and coverage of the learned options through a novel use of Determinantal Point Process (DPP) and optimize these objectives to discover options with both superior diversity and coverage. Our proposed algorithm, ODPP,…
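To make the role of the DPP concrete, the sketch below illustrates the general property the abstract relies on: under a DPP, a set of items is scored by the determinant of its similarity (kernel) matrix, so near-duplicate items score low and mutually dissimilar items score high. This is not the ODPP algorithm itself. The function name dpp_log_det, the toy feature vectors, and the choice of a cosine-similarity kernel are all assumptions made for illustration.

```python
import numpy as np


def dpp_log_det(features: np.ndarray) -> float:
    """Log-determinant of the cosine-similarity Gram matrix of the rows.

    Hypothetical helper for illustration: each row stands in for an
    embedding of one option (skill). A higher value means a more diverse set.
    """
    # Normalize rows so the kernel entries are cosine similarities.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    kernel = unit @ unit.T
    # A small jitter keeps the matrix numerically positive definite.
    kernel = kernel + 1e-9 * np.eye(kernel.shape[0])
    _sign, logdet = np.linalg.slogdet(kernel)
    return float(logdet)


# Three mutually orthogonal "options": maximally diverse under this kernel.
diverse = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Three nearly parallel "options": low diversity.
similar = np.array([
    [1.00, 0.00, 0.00],
    [0.99, 0.10, 0.00],
    [0.98, 0.00, 0.15],
])

diverse_score = dpp_log_det(diverse)
similar_score = dpp_log_det(similar)
print(f"diverse set: {diverse_score:.3f}")
print(f"similar set: {similar_score:.3f}")
```

The orthogonal set yields a log-determinant close to zero (the kernel is nearly the identity), while the nearly parallel set yields a strongly negative value. An objective that maximizes such a determinant therefore pushes the selected items apart, which is the sense in which a DPP can quantify diversity.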
Taxonomy
Topics: Optimization and Search Problems · Reinforcement Learning in Robotics
