Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds

Jia Lu; Taoran Yi; Jiemin Fang; Chen Yang; Chuiyun Wu; Wei Shen; Wenyu Liu; Qi Tian; Xinggang Wang

arXiv:2508.14892 · cs.GR · August 21, 2025

Open Access

TL;DR

This paper introduces a method that reconstructs a 3D human model from only two images, the front and back views, in about 190 ms on a single GPU, lowering the barrier for users to create digital humans.

Contribution

The paper reconstructs 3D human bodies from just two images: a geometry reconstruction model, redesigned from foundation reconstruction models, predicts consistent point clouds, and an enhancement algorithm supplements the missing color information.

Findings

01

Reconstructs human models in 190 ms on a single GPU

02

Achieves state-of-the-art results on THuman2.0 and cross-domain datasets

03

Works effectively with low-cost mobile device images

Abstract

Reconstructing 3D human bodies from sparse views has been an appealing topic, which is crucial to broadening related applications. In this paper, we propose a challenging but valuable task: reconstructing the human body from only two images, i.e., the front and back views, which can largely lower the barrier for users to create their own 3D digital humans. The main challenges lie in building 3D consistency and recovering missing information from the highly sparse input. We redesign a geometry reconstruction model based on foundation reconstruction models and train it on extensive human data, so that it predicts consistent point clouds even when the input images have scarce overlap. Furthermore, an enhancement algorithm is applied to supplement the missing color information, yielding complete colored human point clouds, which are directly transformed into 3D…

Taxonomy

Topics: Advanced Vision and Imaging · Human Pose and Action Recognition