Efficient Robotic Policy Learning via Latent Space Backward Planning
Dongxiu Liu, Haoyi Niu, Zhihao Wang, Jinliang Zheng, Yinan Zheng, Zhonghong Ou, Jianming Hu, Jianxiong Li, Xianyuan Zhan

TL;DR
This paper introduces a Latent Space Backward Planning (LBP) method for robotic policy learning that improves efficiency and accuracy in long-horizon tasks by grounding goals in latent space and recursively predicting subgoals.
Contribution
The paper proposes a novel LBP scheme that predicts intermediate subgoals in latent space, enabling more efficient and accurate long-horizon robotic planning than existing methods.
Findings
LBP outperforms existing planning methods in simulation and real-robot experiments.
LBP achieves state-of-the-art performance in long-horizon tasks.
Latent space subgoal prediction reduces computational costs and error accumulation.
Abstract
Current robotic planning methods often rely on predicting multi-frame images with full pixel details. While this fine-grained approach can serve as a generic world model, it introduces two significant challenges for downstream policy learning: substantial computational costs that hinder real-time deployment, and accumulated inaccuracies that can mislead action extraction. Planning with coarse-grained subgoals partially alleviates efficiency issues. However, the forward planning schemes these methods use can still result in off-task predictions due to accumulated errors, leading to misalignment with long-term goals. This raises a critical question: Can robotic planning be both efficient and accurate enough for real-time control in long-horizon, multi-stage tasks? To address this, we propose a Latent Space Backward Planning scheme (LBP), which begins by grounding the task into final latent goals,…
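The backward scheme named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that planning starts from a final latent goal and recursively predicts subgoals that lie progressively closer to the current state. The names `backward_plan` and `midpoint_predictor` are hypothetical, and the midpoint rule is only a stand-in for whatever learned subgoal predictor the paper uses.

```python
from typing import Callable, List, Sequence

Latent = List[float]


def midpoint_predictor(current: Sequence[float], target: Sequence[float]) -> Latent:
    """Hypothetical stand-in for a learned subgoal predictor.

    Returns the latent halfway between the current state and the target.
    A real system would replace this with a trained model.
    """
    return [(c + t) / 2.0 for c, t in zip(current, target)]


def backward_plan(
    current: Sequence[float],
    final_goal: Sequence[float],
    predictor: Callable[[Sequence[float], Sequence[float]], Latent],
    depth: int,
) -> List[Latent]:
    """Plan backward from the final goal toward the current state.

    Each step conditions on the current latent and the most recently
    predicted subgoal, so every prediction stays anchored to the final goal.
    The result is ordered from the nearest subgoal to the final goal.
    """
    subgoals: List[Latent] = [list(final_goal)]
    for _ in range(depth):
        subgoals.append(predictor(current, subgoals[-1]))
    return subgoals[::-1]


plan = backward_plan([0.0, 0.0], [8.0, 4.0], midpoint_predictor, depth=3)
```

Because each subgoal in this sketch is derived from the goal rather than rolled forward from the current state, the final goal is always the last entry of the plan regardless of how coarse the intermediate predictions are.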
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning
Methods: Attentive Walk-Aggregating Graph Neural Network
