Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training

Yaxuan Li; Zhongyi Zhou; Yefei Chen; Yanjiang Guo; Jiaming Liu; Shanghang Zhang; Jianyu Chen; Yichen Zhu

arXiv:2604.21741·cs.RO·May 6, 2026

TL;DR

Hi-WM lets humans correct a robot policy's failing rollouts inside a learned world model rather than on the physical robot, improving real-world manipulation success by 37.9 points on average while reducing the need for physical resets.

Contribution

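The paper presents Hi-WM, a post-training framework that uses a learned world model as a reusable corrective substrate: a human intervenes in closed-loop world-model rollouts with short corrective actions, and cached intermediate states support rollback and branching.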

Findings

01. Hi-WM improves real-world success by 37.9 points on average.

02. World-model evaluation correlates strongly with real-world performance (r = 0.953).

03. The approach reduces the need for physical robot resets and supervision.
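
For readers unfamiliar with the correlation figure in the second finding, the following is a minimal sketch of how a correlation coefficient between world-model success rates and real-world success rates could be computed. It assumes r denotes the Pearson coefficient, which the listing does not state explicitly. The function name `pearson_r` and the success-rate values are hypothetical placeholders, not data from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Unnormalized standard deviations; the 1/n factors cancel in the ratio.
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical per-policy success rates, for illustration only.
world_model_success = [0.2, 0.4, 0.6, 0.8]
real_world_success = [0.25, 0.35, 0.65, 0.75]

r = pearson_r(world_model_success, real_world_success)
print(f"r = {r:.3f}")
```

A value of r close to 1 means that policies ranked higher inside the world model also tend to rank higher on the real robot, which is what makes the world model usable as an evaluation proxy.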

Abstract

Post-training is essential for turning pretrained generalist robot policies into reliable task-specific controllers, but existing human-in-the-loop pipelines remain tied to physical execution: each correction requires robot time, scene setup, resets, and operator supervision in the real world. Meanwhile, action-conditioned world models have been studied mainly for imagination, synthetic data generation, and policy evaluation. We propose Human-in-the-World-Model (Hi-WM), a post-training framework that uses a learned world model as a reusable corrective substrate for failure-targeted policy improvement. A policy is first rolled out in closed loop inside the world model; when the rollout becomes incorrect or failure-prone, a human intervenes directly in the model to provide short corrective actions. Hi-WM caches intermediate states and supports rollback and branching, allowing a…
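
The rollout, caching, rollback, and branching steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_world_model`, `toy_policy`, `RolloutSession`, and the scalar state are all hypothetical stand-ins chosen to keep the example runnable. A real world model would predict images or latent states from actions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

State = float   # assumption: a scalar stands in for the world-model state
Action = float


def toy_world_model(state: State, action: Action) -> State:
    """Hypothetical stand-in for a learned action-conditioned world model."""
    return state + action


def toy_policy(state: State) -> Action:
    """Hypothetical policy that always steps by +1.0."""
    return 1.0


@dataclass
class RolloutSession:
    """Closed-loop rollout that caches every intermediate state."""
    step_fn: Callable[[State, Action], State]
    states: List[State] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)

    def reset(self, initial_state: State) -> None:
        self.states = [initial_state]
        self.actions = []

    def step(self, action: Action) -> State:
        next_state = self.step_fn(self.states[-1], action)
        self.states.append(next_state)
        self.actions.append(action)
        return next_state

    def rollback(self, t: int) -> None:
        """Return to cached state t, discarding everything after it."""
        self.states = self.states[: t + 1]
        self.actions = self.actions[:t]

    def branch(self, t: int) -> "RolloutSession":
        """Start a new session from cached state t; the original is untouched."""
        return RolloutSession(self.step_fn, list(self.states[: t + 1]),
                              list(self.actions[:t]))


# The policy rolls out inside the world model and overshoots a goal at 2.5.
session = RolloutSession(toy_world_model)
session.reset(0.0)
for _ in range(3):
    session.step(toy_policy(session.states[-1]))

# A human spots the failure, branches from the cached state at step 2,
# and supplies a short corrective action in place of the policy's action.
corrections: List[Tuple[State, Action]] = []
fixed = session.branch(2)
human_action = 0.5
corrections.append((fixed.states[-1], human_action))
fixed.step(human_action)

print(session.states)  # original, failed rollout
print(fixed.states)    # corrected branch
print(corrections)     # (state, action) pairs usable as post-training data
```

Because states are cached, the same failure point can be revisited without re-running the rollout from the start, which is the property that removes the need for a physical reset in this setting.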
