Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space
Kuan Fang, Patrick Yin, Ashvin Nair, Sergey Levine

TL;DR
This paper introduces Planning to Practice (PTP), a hierarchical goal-conditioned reinforcement learning method that combines offline pre-training and online fine-tuning with subgoal planning in latent space, enabling efficient learning of complex, long-horizon tasks.
Contribution
The paper proposes a hierarchical approach that generates subgoals in latent space and combines offline pre-training with online fine-tuning, making goal-conditioned policies for complex tasks more efficient to train.
Findings
PTP effectively decomposes complex tasks into manageable subgoals.
The method achieves successful transfer from offline data to real-world tasks.
Experimental results demonstrate improved efficiency and feasibility in long-horizon task learning.
Abstract
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments. To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach configurable goals for a wide range of tasks on command. However, such goal-conditioned policies are notoriously difficult and time-consuming to train from scratch. In this paper, we propose Planning to Practice (PTP), a method that makes it practical to train goal-conditioned policies for long-horizon tasks that require multiple distinct types of interactions to solve. Our approach is based on two key ideas. First, we decompose the goal-reaching problem hierarchically, with a high-level planner that sets intermediate subgoals using conditional subgoal generators in the latent space for a low-level model-free policy. Second, we propose a hybrid approach…
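The first idea in the abstract, a high-level planner that chains a conditional subgoal generator in latent space, can be illustrated with a small sketch. This is not the authors' implementation. The generator below is a hypothetical hand-written function standing in for a learned model, the random-sampling optimizer is an assumed choice, and names such as `toy_subgoal_generator` and `plan_subgoals` are invented for illustration only.

```python
import numpy as np


def toy_subgoal_generator(z, u, step=0.5):
    """Hypothetical stand-in for a learned conditional subgoal generator.

    Maps the current latent state z and a noise vector u to the next
    latent subgoal. A real generator would be a trained neural network.
    """
    return z + step * np.tanh(u)


def plan_subgoals(z_start, z_goal, horizon=4, num_samples=256, seed=0):
    """Sample candidate subgoal sequences and keep the one ending nearest the goal.

    Assumption: plain random sampling is used as the optimizer here purely
    for brevity; it is not claimed to be the planner used in the paper.
    """
    rng = np.random.default_rng(seed)
    dim = z_start.shape[0]
    # One noise vector per candidate plan and per planning step.
    noise = rng.standard_normal((num_samples, horizon, dim))
    plans = np.empty_like(noise)
    z = np.repeat(z_start[None, :], num_samples, axis=0)
    for k in range(horizon):
        # Chain the generator: each subgoal is conditioned on the previous one.
        z = toy_subgoal_generator(z, noise[:, k])
        plans[:, k] = z
    # Score each plan by how close its final subgoal lands to the goal latent.
    cost = np.linalg.norm(plans[:, -1] - z_goal, axis=1)
    best = int(np.argmin(cost))
    return plans[best], float(cost[best])


z_start = np.zeros(2)
z_goal = np.array([1.0, -1.0])
subgoals, final_cost = plan_subgoals(z_start, z_goal)
print(subgoals.shape, final_cost)
```

In a hierarchical setup of this kind, each row of `subgoals` would be handed in turn to the low-level goal-conditioned policy, which only has to reach a nearby latent state rather than the distant final goal.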
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification
