Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space

Kuan Fang; Patrick Yin; Ashvin Nair; Sergey Levine

arXiv:2205.08129·cs.RO·April 19, 2023

Open Access

TL;DR

This paper introduces Planning to Practice (PTP), a hierarchical goal-conditioned reinforcement learning method that combines offline pre-training and online fine-tuning with subgoal planning in latent space, enabling efficient learning of complex, long-horizon tasks.

Contribution

The paper proposes a hierarchical approach that combines latent-space subgoal generation with hybrid offline-online training, making it more efficient to train goal-conditioned policies for complex tasks.

Findings

01. PTP effectively decomposes complex tasks into manageable subgoals.

02. The method achieves successful transfer from offline data to real-world tasks.

03. Experimental results demonstrate improved efficiency and feasibility in long-horizon task learning.

Abstract

General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments. To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach configurable goals for a wide range of tasks on command. However, such goal-conditioned policies are notoriously difficult and time-consuming to train from scratch. In this paper, we propose Planning to Practice (PTP), a method that makes it practical to train goal-conditioned policies for long-horizon tasks that require multiple distinct types of interactions to solve. Our approach is based on two key ideas. First, we decompose the goal-reaching problem hierarchically, with a high-level planner that sets intermediate subgoals using conditional subgoal generators in the latent space for a low-level model-free policy. Second, we propose a hybrid approach…
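The first idea in the abstract, a high-level planner that chains subgoals produced by a conditional generator in latent space, can be illustrated with a toy sketch. This is not the authors' implementation: `toy_subgoal_generator` is a hypothetical stand-in for a learned generator, the planner is plain random shooting, and the cost is simply the distance from the last subgoal to the goal latent. All names and parameters below are assumptions made for illustration.

```python
import numpy as np


def toy_subgoal_generator(z, noise, step=1.0):
    """Hypothetical stand-in for a learned conditional subgoal generator.

    Given current latent state(s) `z` and sampled `noise`, propose the next
    latent subgoal. Here it is just a bounded offset from `z`.
    """
    return z + step * np.tanh(noise)


def plan_subgoals(z_start, z_goal, num_subgoals=4, num_candidates=256, seed=0):
    """Random-shooting planner over subgoal sequences in latent space.

    Samples `num_candidates` noise sequences, rolls the generator out
    recursively from `z_start`, and returns the sequence whose final
    subgoal lands closest to `z_goal` (assumed cost: Euclidean distance).
    """
    rng = np.random.default_rng(seed)
    dim = z_start.shape[0]
    noise = rng.standard_normal((num_candidates, num_subgoals, dim))
    plans = np.empty_like(noise)

    # Every candidate starts from the same initial latent state.
    z = np.broadcast_to(z_start, (num_candidates, dim))
    for k in range(num_subgoals):
        # Each subgoal is generated conditioned on the previous one.
        z = toy_subgoal_generator(z, noise[:, k])
        plans[:, k] = z

    cost = np.linalg.norm(plans[:, -1] - z_goal, axis=1)
    best = int(np.argmin(cost))
    return plans[best], float(cost[best])


if __name__ == "__main__":
    z_start = np.zeros(2)
    z_goal = np.array([2.0, -1.5])
    plan, cost = plan_subgoals(z_start, z_goal)
    print("planned latent subgoals:")
    print(plan)
    print("distance from last subgoal to goal:", cost)
```

In a hierarchical setup like the one the abstract describes, each planned subgoal would be handed in turn to the low-level goal-conditioned policy; the sketch stops at producing the subgoal sequence.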

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification