Monocular Models are Strong Learners for Multi-View Human Mesh Recovery

Haoyu Xie; Shengkai Xu; Cheng Guo; Muhammad Usama Saleem; Wenhan Wu; Chen Chen; Ahmed Helmy; Pu Wang; Hongfei Xue

arXiv:2603.20391 · cs.CV · April 2, 2026

TL;DR

This paper introduces a training-free, multi-view human mesh recovery framework that leverages single-view models and test-time optimization to achieve state-of-the-art results without multi-view training data.

Contribution

The paper proposes a calibration-free, training-free approach that combines pretrained single-view HMR models with test-time optimization for multi-view human mesh recovery.

Findings

01. Achieves state-of-the-art performance on standard benchmarks.

02. Outperforms models trained with explicit multi-view supervision.

03. Eliminates the need for multi-view training data.

Abstract

Multi-view human mesh recovery (HMR) is broadly deployed in diverse domains where high accuracy and strong generalization are essential. Existing approaches can be broadly grouped into geometry-based and learning-based methods. However, geometry-based methods (e.g., triangulation) rely on cumbersome camera calibration, while learning-based approaches often generalize poorly to unseen camera configurations due to the lack of multi-view training data, limiting their performance in real-world scenarios. To enable calibration-free reconstruction that generalizes to arbitrary camera setups, we propose a training-free framework that leverages pretrained single-view HMR models as strong priors, eliminating the need for multi-view training data. Our method first constructs a robust and consistent multi-view initialization from single-view predictions, and then refines it via test-time…
