InpaintHuman: Reconstructing Occluded Humans with Multi-Scale UV Mapping and Identity-Preserving Diffusion Inpainting

Jinlong Fan; Shanshan Zhao; Liang Zheng; Jing Zhang; Yuxiang Yang; Mingming Gong

arXiv:2601.02098 · cs.CV · January 6, 2026

TL;DR

InpaintHuman is a novel method that reconstructs complete, high-fidelity 3D human avatars from occluded monocular videos by combining multi-scale UV mapping with identity-preserving diffusion inpainting, ensuring geometric accuracy and temporal coherence.

Contribution

The paper introduces a multi-scale UV representation and an identity-preserving diffusion inpainting module, advancing occluded human reconstruction with improved detail preservation and identity consistency.

Findings

01. Achieves high-quality, complete 3D human reconstructions from occluded monocular videos.

02. Demonstrates superior performance on synthetic and real-world benchmarks.

03. Ensures temporal coherence and identity preservation in reconstructed avatars.

Abstract

Reconstructing complete and animatable 3D human avatars from monocular videos remains challenging, particularly under severe occlusions. While 3D Gaussian Splatting has enabled photorealistic human rendering, existing methods struggle with incomplete observations, often producing corrupted geometry and temporal inconsistencies. We present InpaintHuman, a novel method for generating high-fidelity, complete, and animatable avatars from occluded monocular videos. Our approach introduces two key innovations: (i) a multi-scale UV-parameterized representation with hierarchical coarse-to-fine feature interpolation, enabling robust reconstruction of occluded regions while preserving geometric details; and (ii) an identity-preserving diffusion inpainting module that integrates textual inversion with semantic-conditioned guidance for subject-specific, temporally coherent completion. Unlike…
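To make the idea of coarse-to-fine feature interpolation over a UV parameterization more concrete, here is a minimal sketch. It is not the authors' implementation: the grid layout, the bilinear sampling, the concatenation across levels, and every name (`bilinear_sample`, `multiscale_uv_feature`, `make_ramp_grid`) are assumptions made for illustration. The intuition it tries to capture is that a coarse grid spreads each parameter over a large UV area, so a surface point that was never observed can still borrow information from observed neighbours, while finer grids add local detail.

```python
from typing import List

# A feature grid indexed as grid[row][col][channel] (hypothetical layout).
Grid = List[List[List[float]]]


def bilinear_sample(grid: Grid, u: float, v: float) -> List[float]:
    """Bilinearly interpolate a feature grid at UV coordinates in [0, 1]."""
    height, width = len(grid), len(grid[0])
    # Clamp so that out-of-range UVs fall back to the border texels.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    x, y = u * (width - 1), v * (height - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    sampled = []
    for c in range(len(grid[0][0])):
        top = grid[y0][x0][c] * (1 - fx) + grid[y0][x1][c] * fx
        bottom = grid[y1][x0][c] * (1 - fx) + grid[y1][x1][c] * fx
        sampled.append(top * (1 - fy) + bottom * fy)
    return sampled


def multiscale_uv_feature(grids: List[Grid], u: float, v: float) -> List[float]:
    """Concatenate features sampled from grids ordered coarse to fine.

    Concatenation is an assumption of this sketch; summation or a learned
    fusion would be equally plausible ways to combine the levels.
    """
    feature: List[float] = []
    for grid in grids:
        feature.extend(bilinear_sample(grid, u, v))
    return feature


def make_ramp_grid(res: int, channels: int) -> Grid:
    """Toy grid whose value rises linearly from 0 to 1 along the u axis."""
    denom = max(res - 1, 1)
    return [[[col / denom] * channels for col in range(res)] for _ in range(res)]
```

For example, `multiscale_uv_feature([make_ramp_grid(2, 1), make_ramp_grid(4, 1)], 0.5, 0.25)` returns one value per level. In a trained model the grids would hold learned features rather than a ramp, and the concatenated vector would typically be decoded into per-point appearance or geometry attributes.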

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation