RoboHorizon: An LLM-Assisted Multi-View World Model for Long-Horizon Robotic Manipulation
Zixuan Chen, Jing Huo, Yangtao Chen, Yang Gao

TL;DR
RoboHorizon introduces an LLM-assisted multi-view world model that enhances long-horizon robotic manipulation by improving task recognition, perception, and planning, leading to significant performance gains in benchmark environments.
Contribution
The paper presents RoboHorizon, a novel multi-view world model leveraging LLMs and keyframe discovery to improve long-horizon robotic manipulation tasks.
Findings
Achieves a 23.35% higher success rate than prior visual model-based RL methods on RLBench short-horizon tasks.
Achieves a 29.23% higher success rate on long-horizon and furniture assembly tasks.
Outperforms state-of-the-art visual model-based RL methods.
Abstract
Efficient control in long-horizon robotic manipulation is challenging due to complex representation and policy learning requirements. Model-based visual reinforcement learning (RL) has shown great potential in addressing these challenges but still faces notable limitations, particularly in handling sparse rewards and complex visual features in long-horizon environments. To address these limitations, we propose the Recognize-Sense-Plan-Act (RSPA) pipeline for long-horizon tasks and further introduce RoboHorizon, an LLM-assisted multi-view world model tailored for long-horizon robotic manipulation. In RoboHorizon, pre-trained LLMs generate dense reward structures for multi-stage sub-tasks based on task language instructions, enabling robots to better recognize long-horizon tasks. Keyframe discovery is then integrated into the multi-view masked autoencoder (MAE) architecture to enhance the…
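To make the idea of a dense reward structure over multi-stage sub-tasks concrete, here is a minimal Python sketch. It is not the paper's implementation: in RoboHorizon the reward structure is generated by a pre-trained LLM from task language instructions, whereas here the stages are written by hand. The `Stage` class, the `dense_reward` function, the state keys `gripper_to_object` and `object_height`, and the 0.2 lift target are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical state representation: a flat dict of named scalar features.
State = Dict[str, float]


@dataclass
class Stage:
    """One sub-task of a long-horizon task (illustrative, not from the paper)."""
    name: str
    # Returns progress in [0, 1]; 1.0 means the stage is complete.
    progress: Callable[[State], float]


def dense_reward(stages: List[Stage], state: State) -> float:
    """Credit for every completed stage plus partial credit for the first unfinished one."""
    reward = 0.0
    for stage in stages:
        p = min(max(stage.progress(state), 0.0), 1.0)  # clamp to [0, 1]
        reward += p
        if p < 1.0:
            break  # later stages earn nothing until this one is done
    return reward


# Hypothetical two-stage "reach, then lift" task.
stages = [
    Stage("reach", lambda s: 1.0 - min(s["gripper_to_object"], 1.0)),
    Stage("lift", lambda s: min(s["object_height"] / 0.2, 1.0)),
]

halfway_to_object = {"gripper_to_object": 0.5, "object_height": 0.0}
lifting = {"gripper_to_object": 0.0, "object_height": 0.1}

print(dense_reward(stages, halfway_to_object))
print(dense_reward(stages, lifting))
```

Compared with a sparse reward that only pays out when the whole task succeeds, a staged signal like this gives the agent feedback at every step, which is the motivation the abstract gives for having the LLM produce dense rewards.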
Taxonomy
Topics: Image Processing and 3D Reconstruction · Robotic Path Planning Algorithms · Robotics and Automated Systems
