Towards Geometry-Aware and Motion-Guided Video Human Mesh Recovery
Hongjun Chen, Huan Zheng, Wencheng Han, Jianbing Shen

TL;DR
This paper introduces HMRMamba, a novel video-based 3D human mesh recovery framework that uses structured state space models and geometry-aware modules to improve accuracy, temporal consistency, and efficiency in challenging scenarios.
Contribution
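The paper proposes a video-based human mesh recovery (HMR) paradigm built on structured state space models (SSMs). It combines a Geometry-Aware Lifting Module, which grounds 2D-to-3D pose lifting in image features, with a Motion-guided Reconstruction Network for 3D mesh recovery.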
Findings
Sets new state-of-the-art on 3DPW, MPI-INF-3DHP, and Human3.6M benchmarks.
Improves reconstruction accuracy and temporal consistency.
Offers superior computational efficiency.
Abstract
Existing video-based 3D Human Mesh Recovery (HMR) methods often produce physically implausible results, stemming from their reliance on flawed intermediate 3D pose anchors and their inability to effectively model complex spatiotemporal dynamics. To overcome these deep-rooted architectural problems, we introduce HMRMamba, a new paradigm for HMR that pioneers the use of Structured State Space Models (SSMs) for their efficiency and long-range modeling prowess. Our framework is distinguished by two core contributions. First, the Geometry-Aware Lifting Module, featuring a novel dual-scan Mamba architecture, creates a robust foundation for reconstruction. It directly grounds the 2D-to-3D pose lifting process with geometric cues from image features, producing a highly reliable 3D pose sequence that serves as a stable anchor. Second, the Motion-guided Reconstruction Network leverages this…
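As a rough illustration of the scanning idea the abstract refers to, the sketch below runs a plain linear state space recurrence over a sequence in both temporal directions and sums the results. This is only one possible reading of "dual-scan": the function names `ssm_scan` and `dual_scan`, the fixed matrices `A`, `B`, `C`, and the summation are assumptions made for illustration and are not taken from the paper. Mamba itself uses input-dependent (selective) parameters, which are omitted here.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear state space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.

    x: (T, d_in), A: (n, n), B: (n, d_in), C: (d_out, n).
    Hypothetical helper; parameters are fixed rather than input-dependent.
    """
    T = x.shape[0]
    h = np.zeros(A.shape[0])
    ys = np.empty((T, C.shape[0]))
    for t in range(T):
        h = A @ h + B @ x[t]   # update the hidden state with the current frame
        ys[t] = C @ h          # read out the output for this frame
    return ys

def dual_scan(x, A, B, C):
    """Scan the sequence forward and backward in time, then sum both outputs.

    Assumed combination rule (sum); other choices such as concatenation
    are equally plausible.
    """
    fwd = ssm_scan(x, A, B, C)
    bwd = ssm_scan(x[::-1], A, B, C)[::-1]  # reverse, scan, restore the order
    return fwd + bwd

# Usage: a single impulse at frame 2 of a 5-frame, 1-channel sequence.
x = np.zeros((5, 1))
x[2, 0] = 1.0
A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[1.0]])
out = dual_scan(x, A, B, C)
```

With a forward-only scan, frames before the impulse would see nothing. Scanning in both directions lets every frame receive context from the past and the future, which is the usual motivation for bidirectional scans over video sequences.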
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Human Motion and Animation
