GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

GigaBrain Team: Boyuan Wang; Bohan Li; Chaojun Ni; Guan Huang; Guosheng Zhao; Hao Li; Jie Li; Jindi Lv; Jingyu Liu; Lv Feng; Mingming Yu; Peng Li; Qiuping Deng; Tianze Liu; Xinyu Zhou; Xinze Chen; Xiaofeng Wang; Yang Wang; Yifan Li; Yifei Nie; Yilong Li; Yukun Zhou; Yun Ye; Zhichao Liu; Zheng Zhu

arXiv:2602.12099 · cs.CV · February 27, 2026

TL;DR

GigaBrain-0.5M* is a vision-language-action model that leverages world model-based reinforcement learning to improve multi-step action prediction, demonstrating significant performance gains and reliable long-horizon task execution in robotic manipulation.

Contribution

The paper introduces GigaBrain-0.5M*, integrating world model-based reinforcement learning with a pre-trained VLA model for enhanced cross-task robotic manipulation.

Findings

01. 30% performance improvement on complex tasks

02. Reliable long-horizon task execution in real-world settings

03. Effective cross-task adaptation via RAMP

Abstract

Vision-language-action (VLA) models that directly predict multi-step action chunks from current observations face inherent limitations due to constrained scene understanding and weak future anticipation capabilities. In contrast, video world models pre-trained on web-scale video corpora exhibit robust spatiotemporal reasoning and accurate future prediction, making them a natural foundation for enhancing VLA learning. Therefore, we propose GigaBrain-0.5M*, a VLA model trained via world model-based reinforcement learning. Built upon GigaBrain-0.5, which is pre-trained on over 10,000 hours of robotic manipulation data and whose intermediate version currently ranks first on the international RoboChallenge benchmark, GigaBrain-0.5M* further integrates world model-based reinforcement learning via RAMP (Reinforcement leArning via world Model-conditioned Policy)…

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robot Manipulation and Learning