MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction

Qiang Zhang; Jiahao Ma; Peiran Liu; Shuai Shi; Zeran Su; Zifan Wang; Jingkai Sun; Wei Cui; Jialin Yu; Gang Han; Wen Zhao; Pihai Sun; Kangning Yin; Jiaxu Wang; Jiahang Cao; Lingfeng Zhang; Hao Cheng; Xiaoshuai Hao; Yiding Ji; Junwei Liang; Jian Tang; Renjing Xu; Yijie Guo

arXiv:2602.15733 · cs.RO · February 18, 2026

Open Access

TL;DR

MeshMimic is a framework that combines 3D scene reconstruction with motion learning from video, enabling humanoid robots to perform complex, terrain-aware behaviors without expensive motion capture data.

Contribution

It introduces a geometry-aware motion learning approach that integrates 3D scene understanding with humanoid control, reducing reliance on costly datasets and improving physical interaction fidelity.

Findings

01. Achieves robust humanoid motion on diverse terrains
02. Utilizes low-cost monocular sensors for training data
03. Demonstrates improved physical interaction accuracy

Abstract

Humanoid motion control has witnessed significant breakthroughs in recent years, with deep reinforcement learning (RL) emerging as a primary catalyst for achieving complex, human-like behaviors. However, the high dimensionality and intricate dynamics of humanoid robots make manual motion design impractical, leading to a heavy reliance on expensive motion capture (MoCap) data. These datasets are not only costly to acquire but also frequently lack the necessary geometric context of the surrounding physical environment. Consequently, existing motion synthesis frameworks often suffer from a decoupling of motion and scene, resulting in physical inconsistencies such as contact slippage or mesh penetration during terrain-aware tasks. In this work, we present MeshMimic, an innovative framework that bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn…

Taxonomy

Topics: Robot Manipulation and Learning · Human Motion and Animation · 3D Shape Modeling and Analysis