Do's and Don'ts: Learning Desirable Skills with Instruction Videos
Hyunseung Kim, Byungkun Lee, Hojoon Lee, Dongyoon Hwang, Donghu Kim, Jaegul Choo

TL;DR
This paper introduces DoDont, a two-stage instruction-based skill discovery method that uses videos to guide learning towards desirable behaviors and away from unsafe or undesirable ones in continuous control tasks.
Contribution
The paper proposes a novel instruction-based approach that leverages videos to improve unsupervised skill discovery, emphasizing safety and desirability in learned behaviors.
Findings
Effective learning of desirable behaviors with fewer than 8 videos
Avoidance of unsafe behaviors like tripping and rolling
Successful application to complex control tasks
Abstract
Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards. However, it faces challenges in learning complex behaviors and often leads to learning unsafe or undesirable behaviors. For instance, in various continuous control tasks, current unsupervised skill discovery methods succeed in learning basic locomotions like standing but struggle with learning more complex movements such as walking and running. Moreover, they may acquire unsafe behaviors like tripping and rolling or navigate to undesirable locations such as pitfalls or hazardous areas. In response, we present DoDont (Do's and Don'ts), an instruction-based skill discovery algorithm composed of two stages. First, in an instruction learning stage, DoDont leverages action-free instruction videos to train an instruction network to distinguish desirable transitions from…
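The instruction learning stage described above can be illustrated with a toy classifier over state transitions. This is a minimal sketch under stated assumptions, not the paper's implementation: the 1-D "track" states, the synthetic forward/backward transitions standing in for Do and Don't videos, the single-weight logistic model, and all function names (`make_transitions`, `desirability`, `train_instruction_classifier`) are hypothetical. It only shows the idea of labeling action-free transitions from desirable footage as positive and transitions from undesirable footage as negative; how the resulting score is used in the second stage is not covered by the excerpt and is not modeled here.

```python
import math
import random


def make_transitions(direction, n, rng):
    """Return n synthetic (s, s_next) pairs on a 1-D track.

    direction=+1 moves forward (hypothetical stand-in for a "Do" video),
    direction=-1 moves backward (hypothetical stand-in for a "Don't" video).
    No actions are recorded, mirroring action-free video data.
    """
    pairs = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        step = rng.uniform(0.05, 0.2)
        pairs.append((s, s + direction * step))
    return pairs


def desirability(w, s, s_next):
    """Estimated probability that the transition (s, s_next) is desirable."""
    return 1.0 / (1.0 + math.exp(-w * (s_next - s)))


def train_instruction_classifier(do_pairs, dont_pairs, epochs=300, lr=1.0):
    """Fit a one-weight logistic classifier by full-batch gradient descent.

    Assumption: the only feature is the displacement s_next - s, and there is
    no bias term, because the toy classes are symmetric around zero.
    """
    data = [(p, 1.0) for p in do_pairs] + [(p, 0.0) for p in dont_pairs]
    w = 0.0
    for _ in range(epochs):
        grad = 0.0
        for (s, s_next), label in data:
            # Gradient of the binary cross-entropy loss with respect to w.
            grad += (desirability(w, s, s_next) - label) * (s_next - s)
        w -= lr * grad / len(data)
    return w


rng = random.Random(0)
w = train_instruction_classifier(
    make_transitions(+1, 100, rng),  # transitions from "Do" footage
    make_transitions(-1, 100, rng),  # transitions from "Don't" footage
)
forward_score = desirability(w, 0.0, 0.2)
backward_score = desirability(w, 0.0, -0.2)
print(f"forward: {forward_score:.2f}, backward: {backward_score:.2f}")
```

In this toy setting the classifier scores forward transitions above 0.5 and backward transitions below 0.5. A real instruction network would presumably operate on image observations with a learned encoder, which this sketch does not attempt.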
Taxonomy
Topics: Innovations in Educational Methods
