Realistic Clothed Human and Object Joint Reconstruction from a Single Image
Ayushi Dutta, Marco Pesavento, Marco Volino, Adrian Hilton, Armin Mustafa

TL;DR
This paper presents a novel implicit neural approach for detailed 3D reconstruction of clothed humans and objects from a single image, overcoming occlusion and detail loss issues with attention mechanisms and inpainting.
Contribution
It introduces an implicit representation model conditioned on pose priors and employs a diffusion-based inpainting method to improve detail recovery in human-object reconstruction.
Findings
Outperforms existing methods in realism and detail on synthetic and real datasets.
Effectively handles occlusions with a diffusion inpainting approach.
Captures clothing details better than template-based models.
Abstract
Recent approaches to jointly reconstruct 3D humans and objects from a single RGB image represent 3D shapes with template-based or coarse models, which fail to capture details of loose clothing on human bodies. In this paper, we introduce a novel implicit approach for jointly reconstructing realistic 3D clothed humans and objects from a monocular view. For the first time, we model both the human and the object with an implicit representation, allowing us to capture more realistic details such as clothing. This task is extremely challenging due to human-object occlusions and the lack of 3D information in 2D images, often leading to poor detail reconstruction and depth ambiguity. To address these problems, we propose a novel attention-based neural implicit model that leverages image pixel alignment from both the input human-object image for a global understanding of the human-object scene and…
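As an illustration of what "pixel alignment" in an implicit model generally means, the sketch below queries an occupancy value for a 3D point by projecting it into the image, sampling a feature vector at that pixel, and feeding the feature plus the point's depth to a small MLP. This is a generic pixel-aligned implicit function in plain NumPy, not the paper's architecture: the orthographic camera, the [-1, 1] coordinate range, the two-layer MLP, and the names `bilinear_sample` and `query_occupancy` are all assumptions made for illustration, and the attention mechanism, pose priors, and inpainting stage are not modelled.

```python
import numpy as np

def bilinear_sample(feat, u, v):
    """Sample an (H, W, C) feature map at continuous pixel coordinates (u, v)."""
    h, w, _ = feat.shape
    u = np.clip(u, 0.0, w - 1.0)
    v = np.clip(v, 0.0, h - 1.0)
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    # Interpolate along u on the two neighbouring rows, then along v.
    top = (1 - du) * feat[v0, u0] + du * feat[v0, u1]
    bottom = (1 - du) * feat[v1, u0] + du * feat[v1, u1]
    return (1 - dv) * top + dv * bottom

def query_occupancy(point, feat, w1, b1, w2, b2):
    """Occupancy probability for a 3D point in [-1, 1]^3.

    Assumes an orthographic camera looking along z (a hypothetical setup).
    """
    h, w, _ = feat.shape
    x, y, z = point
    u = (x + 1.0) * 0.5 * (w - 1)
    v = (1.0 - (y + 1.0) * 0.5) * (h - 1)  # image rows grow downwards
    pixel_feat = bilinear_sample(feat, u, v)
    inp = np.concatenate([pixel_feat, [z]])  # pixel-aligned feature + depth
    hidden = np.maximum(0.0, w1 @ inp + b1)  # ReLU layer
    logit = w2 @ hidden + b2
    return 1.0 / (1.0 + np.exp(-logit))      # sigmoid -> value in [0, 1]

# Example with random (untrained) features and weights.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 4))
w1, b1 = 0.1 * rng.normal(size=(16, 5)), np.zeros(16)
w2, b2 = 0.1 * rng.normal(size=16), 0.0
occ = query_occupancy((0.2, -0.3, 0.5), feat, w1, b1, w2, b2)
```

In a trained model of this kind, the surface is typically extracted by evaluating such a query on a dense grid of points and taking the 0.5 level set of the occupancy values.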
Taxonomy
Methods: Diffusion
