Distilling a Hierarchical Policy for Planning and Control via Representation and Reinforcement Learning

Jung-Su Ha; Young-Jin Park; Hyeok-Joo Chae; Soon-Seo Park; Han-Lim Choi

arXiv:2011.08345·cs.LG·April 7, 2021

TL;DR

This paper introduces DISH, a hierarchical policy framework that combines representation learning and reinforcement learning. Because it plans in a low-dimensional latent space, it supports flexible task adaptation in environments with high-dimensional observations and actions.

Contribution

DISH is a novel hierarchical policy distillation method that leverages latent variable models for flexible planning and control across multiple tasks without retraining.

Findings

01. Learned compact latent representations for complex humanoid control
02. Policy can adapt to new tasks like navigation without retraining
03. Effective in high-dimensional observation and action spaces

Abstract

We present a hierarchical planning and control framework that enables an agent to perform various tasks and adapt to a new task flexibly. Rather than learning an individual policy for each particular task, the proposed framework, DISH, distills a hierarchical policy from a set of tasks by representation and reinforcement learning. The framework is based on the idea of latent variable models that represent high-dimensional observations using low-dimensional latent variables. The resulting policy consists of two levels of hierarchy: (i) a planning module that reasons a sequence of latent intentions that would lead to an optimistic future and (ii) a feedback control policy, shared across the tasks, that executes the inferred intention. Because the planning is performed in low-dimensional latent space, the learned policy can immediately be used to solve or adapt to new tasks without…
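The two-level structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the latent dynamics, the random-shooting planner, the identity encoder, and every function name below are hypothetical stand-ins chosen only to show how a planner over latent intentions can sit on top of a shared low-level controller.

```python
import numpy as np

def latent_dynamics(z, u):
    """Hypothetical latent model: the latent state drifts along intention u."""
    return z + 0.1 * u

def plan_intentions(z0, goal, horizon=10, n_samples=256, rng=None):
    """High level (assumed random-shooting planner, not the paper's method).

    Samples candidate intention sequences, rolls each one out in latent
    space, and returns the sequence with the lowest accumulated distance
    to the goal.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    candidates = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, z0.shape[0]))
    z = np.repeat(z0[None, :], n_samples, axis=0)
    cost = np.zeros(n_samples)
    for t in range(horizon):
        z = latent_dynamics(z, candidates[:, t])
        cost += np.linalg.norm(z - goal, axis=1)
    return candidates[np.argmin(cost)]

def low_level_policy(observation, intention):
    """Low level: stand-in for the shared feedback controller.

    In this toy world the action is simply the clipped intention; a real
    controller would also use the observation.
    """
    return np.clip(intention, -1.0, 1.0)

def run_episode(goal, steps=60, seed=0):
    """Replan at every step and execute only the first planned intention."""
    rng = np.random.default_rng(seed)
    state = np.zeros(2)
    for _ in range(steps):
        z = state  # assumed identity encoder from observation to latent
        intentions = plan_intentions(z, goal, rng=rng)
        action = low_level_policy(state, intentions[0])
        state = state + 0.1 * action  # toy environment step
    return state

if __name__ == "__main__":
    goal = np.array([1.0, -0.5])
    final_state = run_episode(goal)
    print("distance to goal:", np.linalg.norm(final_state - goal))
```

In this sketch, switching to a new task only means passing a different `goal` to the planner; `low_level_policy` is reused unchanged, which mirrors the abstract's point that planning in the latent space allows adaptation to new tasks.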

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control