IDOL: Instant Photorealistic 3D Human Creation from a Single Image
Yiyu Zhuang, Jiaxi Lv, Hao Wen, Qing Shuai, Ailing Zeng, Hao Zhu, Shifeng Chen, Yujiu Yang, Xun Cao, Wei Liu

TL;DR
IDOL introduces a scalable transformer-based method trained on a large synthetic dataset to instantly generate high-resolution, photorealistic 3D human avatars from a single image, enabling real-time animation and editing.
Contribution
The paper presents HuGe100K, a large-scale synthetic dataset, and a transformer model that predicts a 3D Gaussian representation for photorealistic human reconstruction from a single image.
Findings
Reconstructs 3D humans at 1K resolution instantly on a single GPU.
Demonstrates high-quality, photorealistic avatars with pose and shape editing capabilities.
Validates effectiveness through comprehensive experiments.
Abstract
Creating a high-fidelity, animatable 3D full-body avatar from a single image is a challenging task due to the diverse appearance and poses of humans and the limited availability of high-quality training data. To achieve fast and high-quality human reconstruction, this work rethinks the task from the perspectives of dataset, model, and representation. First, we introduce a large-scale HUman-centric GEnerated dataset, HuGe100K, consisting of 100K diverse, photorealistic sets of human images. Each set contains 24-view frames in specific human poses, generated using a pose-controllable image-to-multi-view model. Next, leveraging the diversity in views, poses, and appearances within HuGe100K, we develop a scalable feed-forward transformer model to predict a 3D human Gaussian representation in a uniform space from a given human image. This model is trained to disentangle human pose, body…
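To make the dataset scale described in the abstract concrete, the following is a minimal sketch of how one multi-view set might be represented and how many frames the stated figures imply. It assumes "100K" means exactly 100,000 sets; the class name HumanImageSet, the helper functions, and the file-path layout are hypothetical illustrations and are not taken from the paper or its released code.

```python
from dataclasses import dataclass

# Figures taken from the abstract; treating "100K" as exactly 100,000 is an assumption.
NUM_SETS = 100_000
VIEWS_PER_SET = 24


@dataclass
class HumanImageSet:
    """Hypothetical record: one subject in one pose, rendered from several views."""
    set_id: int
    view_paths: list[str]  # hypothetical file layout, not from the paper


def make_placeholder_set(set_id: int, views: int = VIEWS_PER_SET) -> HumanImageSet:
    """Build a set record with one placeholder image path per view."""
    paths = [f"set_{set_id:06d}/view_{v:02d}.png" for v in range(views)]
    return HumanImageSet(set_id=set_id, view_paths=paths)


def total_frames(num_sets: int = NUM_SETS, views: int = VIEWS_PER_SET) -> int:
    """Total image count implied by the number of sets and views per set."""
    return num_sets * views


example = make_placeholder_set(0)
print(len(example.view_paths))  # number of views in one set
print(total_frames())           # frames implied across the whole dataset
```

Under these assumptions the dataset would hold 100,000 × 24 = 2,400,000 frames, which gives a sense of the scale the feed-forward model is trained on.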
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Motion and Animation
Methods: Sparse Evolutionary Training
