RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
Weixin Mao, Weiheng Zhong, Zhou Jiang, Dong Fang, Zhongyue Zhang, Zihan Lan, Haosheng Li, Fan Jia, Tiancai Wang, Haoqiang Fan, Osamu Yoshie

TL;DR
RoboMatrix introduces a hierarchical, skill-centric framework utilizing large language models and a unified vision-language-action model to enable scalable, generalizable robot task planning and execution in open-world environments.
Contribution
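It presents RoboMatrix, a skill-centric hierarchical framework; introduces what the authors describe as the first unified vision-language-action (VLA) model, which integrates movement and manipulation; and demonstrates skill composition for improved generalization in robot task execution.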
Findings
50% higher success rate on unseen tasks
Effective skill composition enables generalization
Unified VLA model integrates movement and manipulation
Abstract
Existing robot policies predominantly adopt the task-centric approach, requiring end-to-end task data collection. This results in limited generalization to new tasks and difficulties in pinpointing errors within long-horizon, multi-stage tasks. To address this, we propose RoboMatrix, a skill-centric hierarchical framework designed for scalable robot task planning and execution in open-world environments. RoboMatrix extracts general meta-skills from diverse complex tasks, enabling the completion of unseen tasks through skill composition. Its architecture consists of a high-level scheduling layer that utilizes large language models (LLMs) for task decomposition, an intermediate skill layer housing meta-skill models, and a low-level hardware layer for robot control. A key innovation of our work is the introduction of the first unified vision-language-action (VLA) model capable of…
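The three-layer structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`HardwareLayer`, `SkillLayer`, `decompose`, `execute`, and the three meta-skills) is hypothetical, the LLM-based scheduler is replaced by a fixed lookup table, the meta-skill models are replaced by plain functions, and the example task is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class HardwareLayer:
    """Hypothetical stand-in for low-level robot control: it only logs commands."""
    log: List[str] = field(default_factory=list)

    def send(self, command: str) -> None:
        self.log.append(command)


# A meta-skill takes a target description and the hardware layer.
MetaSkill = Callable[[str, HardwareLayer], None]


def move_to(target: str, hw: HardwareLayer) -> None:
    hw.send(f"move_to:{target}")


def grasp(target: str, hw: HardwareLayer) -> None:
    hw.send(f"grasp:{target}")


def release(target: str, hw: HardwareLayer) -> None:
    hw.send(f"release:{target}")


@dataclass
class SkillLayer:
    """Intermediate layer: a registry of reusable meta-skills (assumed interface)."""
    skills: Dict[str, MetaSkill]

    def run(self, name: str, target: str, hw: HardwareLayer) -> None:
        if name not in self.skills:
            raise KeyError(f"unknown meta-skill: {name}")
        self.skills[name](target, hw)


def decompose(task: str) -> List[Tuple[str, str]]:
    """Stand-in for LLM task decomposition: a fixed table, not a real model call."""
    plans = {
        # Invented example task, composed purely from existing meta-skills.
        "put the can in the drawer": [
            ("move_to", "can"),
            ("grasp", "can"),
            ("move_to", "drawer"),
            ("release", "can"),
        ],
    }
    if task not in plans:
        raise ValueError(f"no plan available for task: {task}")
    return plans[task]


def execute(task: str, skill_layer: SkillLayer, hw: HardwareLayer) -> List[str]:
    """Scheduling layer: decompose the task, then dispatch each step to a meta-skill."""
    for name, target in decompose(task):
        skill_layer.run(name, target, hw)
    return hw.log


if __name__ == "__main__":
    layer = SkillLayer({"move_to": move_to, "grasp": grasp, "release": release})
    for command in execute("put the can in the drawer", layer, HardwareLayer()):
        print(command)
```

In this sketch a new task needs only a new plan over the existing skill registry, not new end-to-end task data, which is the sense in which skill composition is meant to support unseen tasks.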
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms
Methods: ADaptive gradient method with the OPTimal convergence rate
