Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People
Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton

TL;DR
This paper introduces a new dataset and a multi-view loss function to significantly improve the accuracy and completeness of single-image 3D human reconstruction, especially for clothed people.
Contribution
The paper presents 3DVH, a synthetic dataset of realistic clothed people, and a novel multi-view loss for training monocular volumetric shape estimation, which improves generalisation and reconstruction accuracy.
Findings
Outperforms previous state-of-the-art methods
Achieves higher accuracy and completeness in reconstructions
Demonstrates effective transfer from synthetic to real images
Abstract
We present a novel method to improve the accuracy of the 3D reconstruction of clothed human shape from a single image. Recent work has introduced volumetric, implicit and model-based shape learning frameworks for reconstruction of objects and people from one or more images. However, the accuracy and completeness of reconstruction of clothed people are limited due to the large variation in shape resulting from clothing, hair, body size, pose and camera viewpoint. This paper introduces two advances to overcome this limitation: firstly, a new synthetic dataset of realistic clothed people, 3DVH; and secondly, a novel multiple-view loss function for training of monocular volumetric shape estimation, which is demonstrated to significantly improve generalisation and reconstruction accuracy. The 3DVH dataset of realistic clothed 3D human models rendered with diverse natural backgrounds is…
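The abstract does not give the form of the multiple-view loss, so the following is only a minimal sketch of one way such a loss could be assembled, not the authors' formulation. It assumes that each training sample provides occupancy volumes predicted from several views of the same person, that these volumes have already been aligned to a common reference frame, and that each volume is flattened to a list of occupancy probabilities to keep the example dependency-free. The function names (`bce`, `mse`, `multi_view_loss`), the pairwise mean-squared consistency term, and the `weight` parameter are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flattened occupancy grids."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(target)


def mse(a, b):
    """Mean squared difference between two flattened occupancy grids."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_view_loss(preds, target, weight=1.0):
    """Hypothetical multi-view loss (an assumption, not the paper's definition).

    preds:  list of per-view predicted volumes, assumed pre-aligned
    target: ground-truth occupancy volume in the same frame
    weight: assumed trade-off between the two terms
    """
    # Per-view 3D term: each view's prediction should match the ground truth.
    recon = sum(bce(p, target) for p in preds) / len(preds)
    # Consistency term: predictions from different views should agree.
    pairs = list(combinations(preds, 2))
    consistency = sum(mse(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return recon + weight * consistency
```

With a single view the consistency term vanishes and the sketch reduces to an ordinary per-view reconstruction loss, which is one way to see what the extra views would add during training.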
