Planning with Goal-Conditioned Policies
Soroush Nasiriany, Vitchyr H. Pong, Steven Lin, Sergey Levine

TL;DR
This paper combines goal-conditioned reinforcement learning with planning: a planner chooses which intermediate states to reach, a goal-conditioned policy learned with RL handles how to reach them, and a latent variable model provides a compact state abstraction for complex observations such as images.
Contribution
The paper integrates goal-conditioned policies learned with RL into a planner, and uses a latent variable model to abstract states from high-dimensional inputs such as images.
Findings
Outperforms prior methods on image-based robot navigation tasks.
Enables planning with learned policies for complex, multi-stage behaviors.
Provides a scalable way to abstract states from high-dimensional observations.
Abstract
Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors. However, planning requires suitable abstractions for the states and transitions, which typically need to be designed by hand. In contrast, model-free reinforcement learning (RL) can acquire behaviors from low-level inputs directly, but often struggles with temporally extended tasks. Can we utilize reinforcement learning to automatically form the abstractions needed for planning, thus obtaining the best of both approaches? We show that goal-conditioned policies learned with RL can be incorporated into planning, so that a planner can focus on which states to reach, rather than how those states are reached. However, with complex state observations such as images, not all inputs represent valid states. We therefore also propose using a latent variable model to compactly…
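The division of labor described in the abstract, where the planner decides which states to reach and a goal-conditioned policy handles how to reach them, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 2D state space with no latent variable model, a hypothetical `toy_value` function (negative Euclidean distance) standing in for a learned goal-conditioned value function, a cost equal to the hardest segment in the plan, and a cross-entropy-method search over subgoal sequences. All function names and parameters are illustrative.

```python
import math
import random


def toy_value(state, goal):
    """Hypothetical stand-in for a learned goal-conditioned value function.

    Higher means the goal is easier to reach from the state. Here it is
    simply the negative Euclidean distance (an assumption for this sketch).
    """
    return -math.dist(state, goal)


def plan_cost(start, subgoals, goal, value_fn):
    """Cost of a subgoal sequence: the difficulty of its hardest segment."""
    waypoints = [start, *subgoals, goal]
    return max(-value_fn(a, b) for a, b in zip(waypoints, waypoints[1:]))


def plan_subgoals(start, goal, value_fn, num_subgoals=3, iters=30,
                  pop_size=200, elite_frac=0.1, seed=0):
    """Search over subgoal sequences with the cross-entropy method."""
    rng = random.Random(seed)
    dim = len(start)
    n = num_subgoals * dim
    mean, std = [0.0] * n, [1.0] * n
    num_elite = max(1, int(pop_size * elite_frac))
    best, best_cost = None, float("inf")

    for _ in range(iters):
        scored = []
        for _ in range(pop_size):
            # Sample a flat vector and reshape it into a list of subgoals.
            flat = [rng.gauss(m, s) for m, s in zip(mean, std)]
            subgoals = [tuple(flat[i * dim:(i + 1) * dim])
                        for i in range(num_subgoals)]
            cost = plan_cost(start, subgoals, goal, value_fn)
            scored.append((cost, flat, subgoals))

        # Keep the lowest-cost candidates and refit the sampling distribution.
        scored.sort(key=lambda item: item[0])
        elites = scored[:num_elite]
        if elites[0][0] < best_cost:
            best_cost, best = elites[0][0], elites[0][2]
        for j in range(n):
            vals = [flat[j] for _, flat, _ in elites]
            mean[j] = sum(vals) / len(vals)
            var = sum((v - mean[j]) ** 2 for v in vals) / len(vals)
            std[j] = math.sqrt(var) + 1e-3  # small floor keeps sampling alive

    return best, best_cost


if __name__ == "__main__":
    subgoals, cost = plan_subgoals((0.0, 0.0), (4.0, 0.0), toy_value)
    print("subgoals:", subgoals)
    print("hardest segment cost:", cost)
```

In a full system, each planned subgoal would be passed in turn to the goal-conditioned policy as its goal. With image observations, the search would run over latent codes from the latent variable model instead of raw states, so that candidate subgoals correspond to valid observations.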
Taxonomy
Topics: Reinforcement Learning in Robotics
