APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT
Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen

TL;DR
APART is an approach to diverse skill discovery in reward-free environments that combines all pairs discriminators, a new intrinsic reward, and dropout, discovering all skills in grid worlds with fewer samples than prior methods.
Contribution
The paper proposes APART, a new method combining all pairs discriminators, ascending reward, and dropout for effective skill discovery, and explores a simplified variant based on VIC.
Findings
APART discovers all possible skills in grid worlds with fewer samples than previous works.
The all pairs (one-vs-one) discriminator outperforms the softmax (one-vs-all) discriminator in skill diversity.
A simplified VIC-based algorithm also reaches the maximum number of skills.
Abstract
We study diverse skill discovery in reward-free environments, aiming to discover all possible skills in simple grid-world environments where prior methods have struggled to succeed. This problem is formulated as mutual training of skills using an intrinsic reward and a discriminator trained to predict a skill given its trajectory. Our initial solution replaces the standard one-vs-all (softmax) discriminator with a one-vs-one (all pairs) discriminator and combines it with a novel intrinsic reward function and a dropout regularization technique. The combined approach is named APART: Diverse Skill Discovery using All Pairs with Ascending Reward and Dropout. We demonstrate that APART discovers all the possible skills in grid worlds with remarkably fewer samples than previous works. Motivated by the empirical success of APART, we further investigate an even simpler algorithm that achieves…
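To make the difference between the two discriminator types concrete, the following is a minimal sketch, not the authors' implementation. It assumes the discriminator outputs are given as raw logits and that the one-vs-one discriminator is decoded by majority vote, which is a standard scheme for all pairs classifiers but is an assumption here. The function names `softmax_predict` and `all_pairs_predict` and the example logits are hypothetical. The intrinsic reward and the dropout component are not shown, since the summary above does not specify them.

```python
import itertools
import numpy as np


def softmax_predict(logits):
    """One-vs-all: one logit per skill, normalized jointly over all skills."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return int(probs.argmax()), probs


def all_pairs_predict(pair_logits, num_skills):
    """One-vs-one: one binary logit per unordered skill pair (i, j) with i < j.

    Assumption for this sketch: a positive logit is a vote for skill i,
    otherwise a vote for skill j; the skill with the most votes is predicted.
    """
    pairs = list(itertools.combinations(range(num_skills), 2))
    votes = np.zeros(num_skills, dtype=int)
    for (i, j), logit in zip(pairs, pair_logits):
        votes[i if logit > 0 else j] += 1
    return int(votes.argmax()), votes


num_skills = 4

# Hypothetical softmax logits for one state: a single vector of length K.
softmax_logits = np.array([0.1, 2.0, -1.0, 0.5])
softmax_skill, softmax_probs = softmax_predict(softmax_logits)

# Hypothetical pairwise logits for the same state, ordered as
# (0,1), (0,2), (0,3), (1,2), (1,3), (2,3): K * (K - 1) / 2 = 6 values.
pair_logits = np.array([-2.0, -1.5, 0.3, 1.0, 2.0, -0.5])
pair_skill, pair_votes = all_pairs_predict(pair_logits, num_skills)

print(softmax_skill, pair_skill, pair_votes.tolist())
```

The point of the sketch is structural: the softmax discriminator makes one K-way decision per state, while the all pairs discriminator makes K(K-1)/2 independent binary decisions, each of which only has to separate two skills from each other.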
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Optimization and Search Problems
Methods: Dropout · Softmax
