Variational Option Discovery Algorithms
Joshua Achiam, Harrison Edwards, Dario Amodei, Pieter Abbeel

TL;DR
This paper introduces VALOR, a variational autoencoder-inspired method for option discovery in reinforcement learning, along with a curriculum learning strategy to enhance training stability and diversity of learned behaviors.
Contribution
The paper connects variational option discovery to variational autoencoders, derives VALOR from that connection, and proposes a curriculum that helps a single agent learn more diverse options.
Findings
In VALOR, the policy encodes contexts into trajectories, and a decoder recovers the contexts from the complete trajectories.
Curriculum learning stabilizes training and enables learning more behavioral modes.
The paper also considers how the learned options apply to downstream tasks and where the approach is limited.
Abstract
We explore methods for option discovery based on variational inference and make two algorithmic contributions. First: we highlight a tight connection between variational option discovery methods and variational autoencoders, and introduce Variational Autoencoding Learning of Options by Reinforcement (VALOR), a new method derived from the connection. In VALOR, the policy encodes contexts from a noise distribution into trajectories, and the decoder recovers the contexts from the complete trajectories. Second: we propose a curriculum learning approach where the number of contexts seen by the agent increases whenever the agent's performance is strong enough (as measured by the decoder) on the current set of contexts. We show that this simple trick stabilizes training for VALOR and prior variational option discovery methods, allowing a single agent to learn many more modes of behavior than…
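The curriculum described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0.86 decoder-probability threshold, and the 1.5 growth rule are assumptions chosen for the example. The only behavior taken from the text is that the number of contexts grows when the decoder performs well enough on the current set.

```python
import math
import random


def update_num_contexts(k, mean_log_prob, k_max, threshold=0.86, growth=1.5):
    """Return the number of active contexts for the next training epoch.

    k             -- current number of contexts
    mean_log_prob -- average decoder log-probability of the true context,
                     measured on trajectories from the current epoch
    k_max         -- upper bound on the number of contexts
    threshold     -- assumed decoder probability that counts as "strong enough"
    growth        -- assumed multiplicative growth factor
    """
    if mean_log_prob >= math.log(threshold):
        # The decoder identifies contexts reliably, so expand the set.
        return min(int(growth * k + 1), k_max)
    # Otherwise keep training on the current set of contexts.
    return k


def sample_context(k, rng=random):
    """Sample a context index uniformly from the k currently active contexts."""
    return rng.randrange(k)


if __name__ == "__main__":
    k = 2
    for epoch in range(5):
        # Placeholder for a real measurement: pretend the decoder is perfect.
        k = update_num_contexts(k, mean_log_prob=0.0, k_max=64)
        print(epoch, k, sample_context(k))
```

In a real training loop, `mean_log_prob` would come from evaluating the decoder on the trajectories the policy just generated; here it is a fixed placeholder so that the schedule itself is visible.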
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Reservoir Engineering and Simulation Methods
