VIRAL: Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation
Tairan He, Zi Wang, Haoru Xue, Qingwei Ben, Zhengyi Luo, Wenli Xiao, Ye Yuan, Xingye Da, Fernando Castañeda, Shankar Sastry, Changliu Liu, Guanya Shi, Linxi Fan, and Yuke Zhu

TL;DR
VIRAL is a scalable visual sim-to-real framework enabling humanoid robots to learn loco-manipulation skills entirely in simulation and deploy them zero-shot in real-world scenarios, achieving expert-level performance.
Contribution
The paper introduces VIRAL, a novel large-scale visual sim-to-real approach with a teacher-student design, enabling zero-shot transfer of humanoid loco-manipulation skills.
Findings
Scaling simulation to tens of GPUs (up to 64) makes both teacher and student training reliable; low-compute regimes often fail.
VIRAL achieves up to 54 continuous loco-manipulation cycles in real-world deployment.
RGB-based policy generalizes across diverse environments without fine-tuning.
Abstract
A key barrier to the real-world deployment of humanoid robots is the lack of autonomous loco-manipulation skills. We introduce VIRAL, a visual sim-to-real framework that learns humanoid loco-manipulation entirely in simulation and deploys it zero-shot to real hardware. VIRAL follows a teacher-student design: a privileged RL teacher, operating on full state, learns long-horizon loco-manipulation using a delta action space and reference state initialization. A vision-based student policy is then distilled from the teacher via large-scale simulation with tiled rendering, trained with a mixture of online DAgger and behavior cloning. We find that compute scale is critical: scaling simulation to tens of GPUs (up to 64) makes both teacher and student training reliable, while low-compute regimes often fail. To bridge the sim-to-real gap, VIRAL combines large-scale visual domain randomization…
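The abstract says the student is trained with a mixture of online DAgger and behavior cloning. The toy sketch below illustrates one common reading of that mixture and is not the authors' implementation: in behavior-cloning rollouts the teacher drives the environment, in DAgger rollouts the student drives it, and in both cases the teacher's action is stored as the supervision label. All names (`collect_rollout`, `collect_mixed`, `dagger_fraction`) and the scalar toy dynamics are assumptions made for illustration; in VIRAL the student observes rendered RGB images rather than the state.

```python
import random

# Toy stand-ins (hypothetical): a scalar state, a teacher that halves it,
# an untrained student that does nothing, and additive dynamics.
def teacher(state):
    return -0.5 * state

def student(state):
    return 0.0

def step(state, action):
    return state + action

def collect_rollout(teacher, student, step, s0, horizon, use_student):
    """Roll out one episode and return (state, teacher_action) pairs.

    use_student=False: teacher acts (behavior cloning data).
    use_student=True:  student acts (DAgger data), so the dataset covers
    the states the student actually visits.
    """
    data = []
    state = s0
    for _ in range(horizon):
        label = teacher(state)  # supervision always comes from the teacher
        action = student(state) if use_student else label
        data.append((state, label))
        state = step(state, action)
    return data

def collect_mixed(teacher, student, step, s0, horizon, n_rollouts,
                  dagger_fraction, rng):
    """Mix DAgger and behavior-cloning rollouts at an assumed ratio."""
    data = []
    for _ in range(n_rollouts):
        use_student = rng.random() < dagger_fraction
        data.extend(
            collect_rollout(teacher, student, step, s0, horizon, use_student)
        )
    return data

bc_data = collect_rollout(teacher, student, step, 1.0, 3, use_student=False)
dagger_data = collect_rollout(teacher, student, step, 1.0, 3, use_student=True)
mixed = collect_mixed(teacher, student, step, 1.0, 3, 4, 0.5, random.Random(0))
```

In the behavior-cloning rollout the state decays along the teacher's trajectory, while in the DAgger rollout the untrained student leaves the state unchanged; the teacher still labels every visited state, which is what lets the student learn to recover from its own mistakes.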
Taxonomy
Topics: Robot Manipulation and Learning · Robotic Locomotion and Control · Social Robot Interaction and HRI
