Intrinsic Geometry-Appearance Consistency Optimization for Sparse-View Gaussian Splatting
Kaiqiang Xiong, Rui Peng, Jiahao Wu, Zhanke Wang, Jie Liang, Xiaoyun Zheng, Feng Gao, Ronggang Wang

TL;DR
This paper introduces MVD-HuGaS, a novel method for 3D human reconstruction from a single image using a multi-view diffusion model, joint optimization of camera poses, and facial refinement, achieving state-of-the-art results.
Contribution
The work presents a multi-view diffusion model fine-tuned on 3D datasets, an alignment module for camera pose estimation, and a facial distortion mitigation technique for improved 3D human reconstruction.
Findings
Achieves state-of-the-art performance on the THuman2.0 and 2K2K datasets.
Effectively refines facial regions for higher fidelity.
Enables high-quality free-view 3D human rendering from a single image.
Abstract
3D human reconstruction from a single image is a challenging problem and has been extensively studied in the literature. Recently, some methods have resorted to diffusion models for guidance, optimizing a 3D representation via Score Distillation Sampling (SDS) or generating a back-view image to facilitate reconstruction. However, these methods tend to produce unsatisfactory artifacts (e.g., flattened human structure or over-smoothed results caused by inconsistent priors from multiple views) and struggle to generalize to real-world, in-the-wild images. In this work, we present MVD-HuGaS, enabling free-view 3D human rendering from a single image via a multi-view human diffusion model. We first generate multi-view images from the single reference image with an enhanced multi-view diffusion model, which is fine-tuned on high-quality 3D human datasets to incorporate 3D…
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
