MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement
Xu He, Xiaoyu Li, Di Kang, Jiangnan Ye, Chaopeng Zhang, Liyang Chen, Xiangjun Gao, Han Zhang, Zhiyong Wu, Haolin Zhuang

TL;DR
MagicMan is a novel human view synthesis method that combines 3D-aware diffusion, multi-view attention, and iterative refinement to produce consistent, high-quality multi-view images from a single reference image, improving 3D reconstruction.
Contribution
It introduces a multi-view diffusion model with hybrid attention, geometry-aware dual branches, and iterative refinement for better 3D human synthesis from a single image.
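To make the idea of attention across views concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names, the tensor layout (views, tokens, channels), the absence of learned projections, and the particular pairing of a cheap per-position pass with a dense all-views pass are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention over the second-to-last axis.
    scale = 1.0 / np.sqrt(q.shape[-1])
    weights = softmax((q @ np.swapaxes(k, -1, -2)) * scale, axis=-1)
    return weights @ v

def per_position_cross_view_attention(x):
    # x: (V, N, C) = (views, tokens per view, channels).
    # Each token attends only to the tokens at the same position in the
    # other views, so the cost grows with V**2 per position (cheap).
    xt = np.swapaxes(x, 0, 1)          # (N, V, C)
    out = attention(xt, xt, xt)        # attention across the V axis
    return np.swapaxes(out, 0, 1)      # back to (V, N, C)

def dense_cross_view_attention(x):
    # Every token attends to every token of every view (thorough, costly).
    V, N, C = x.shape
    flat = x.reshape(1, V * N, C)
    return attention(flat, flat, flat).reshape(V, N, C)

def hybrid_multi_view_attention(x):
    # Hypothetical combination: residual per-position pass, then a
    # residual dense pass. The paper's actual layer design may differ.
    h = x + per_position_cross_view_attention(x)
    return h + dense_cross_view_attention(h)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 16, 8))   # 4 views, 16 tokens, 8 channels
    print(hybrid_multi_view_attention(features).shape)
```

The per-position pass is inexpensive because each token compares itself with only one token per view, while the dense pass lets any token exchange information with any other at a higher cost. A real model would add learned query, key, and value projections and multiple heads.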
Findings
Outperforms existing methods in novel view synthesis.
Enhances 3D human reconstruction accuracy.
Achieves high consistency across generated views.
Abstract
Existing works in single-image human reconstruction suffer either from weak generalizability due to insufficient training data or from 3D inconsistencies due to a lack of comprehensive multi-view knowledge. In this paper, we introduce MagicMan, a human-specific multi-view diffusion model designed to generate high-quality novel view images from a single reference image. At its core, we leverage a pre-trained 2D diffusion model as the generative prior for generalizability, with the parametric SMPL-X model as the 3D body prior to promote 3D awareness. To tackle the critical challenge of maintaining consistency while achieving dense multi-view generation for improved 3D human reconstruction, we first introduce hybrid multi-view attention to facilitate both efficient and thorough information interchange across different views. Additionally, we present a geometry-aware dual branch to perform concurrent…
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging
Methods: Softmax · Attention Is All You Need · Diffusion
