Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers

Zi-Xin Zou; Zhipeng Yu; Yuan-Chen Guo; Yangguang Li; Ding Liang; Yan-Pei Cao; Song-Hai Zhang

arXiv:2312.09147 · cs.CV · December 20, 2023 · 2 cites

Open Access 1 Repo 2 Models

TL;DR

This paper presents a fast, transformer-based method for single-view 3D reconstruction. It uses a hybrid Triplane-Gaussian representation that renders faster than implicit representations and reconstructs with higher quality than explicit ones.

Contribution

Introduces a novel hybrid Triplane-Gaussian representation with transformer decoders for efficient, high-quality single-view 3D reconstruction via feed-forward inference.

Findings

01. Faster rendering speed than implicit methods
02. Higher reconstruction quality than explicit methods
03. Effective on both synthetic and real-world images

Abstract

Recent advancements in 3D reconstruction from single images have been driven by the evolution of generative models. Prominent among these are methods based on Score Distillation Sampling (SDS) and the adaptation of diffusion models in the 3D domain. Despite their progress, these techniques often face limitations due to slow optimization or rendering processes, leading to extensive training and optimization times. In this paper, we introduce a novel approach for single-view reconstruction that efficiently generates a 3D model from a single image via feed-forward inference. Our method utilizes two transformer-based networks, namely a point decoder and a triplane decoder, to reconstruct 3D objects using a hybrid Triplane-Gaussian intermediate representation. This hybrid representation strikes a balance, achieving a faster rendering speed compared to implicit representations while…
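The abstract does not spell out how the two parts of the hybrid representation interact, so the following is only a minimal sketch of one common way a triplane can be queried at explicit 3D points: each point is projected onto three axis-aligned feature planes, features are sampled bilinearly, and the result is turned into per-point Gaussian attributes. All names here (`bilinear_sample`, `query_triplane`, `decode_gaussian`) are hypothetical, concatenating the three plane features is an assumption, and the decoder is a toy stand-in for a learned network, not the paper's decoders.

```python
import math


def bilinear_sample(plane, u, v):
    """Bilinearly sample a feature plane at continuous coords (u, v) in [-1, 1].

    plane[row][col] is a list of C floats; u maps to columns, v to rows.
    """
    height, width = len(plane), len(plane[0])
    # Map [-1, 1] to pixel coordinates [0, size - 1], clamping to the grid.
    x = min(max((u + 1.0) * 0.5 * (width - 1), 0.0), width - 1.0)
    y = min(max((v + 1.0) * 0.5 * (height - 1), 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    tx, ty = x - x0, y - y0
    channels = len(plane[0][0])
    return [
        (1 - ty) * ((1 - tx) * plane[y0][x0][c] + tx * plane[y0][x1][c])
        + ty * ((1 - tx) * plane[y1][x0][c] + tx * plane[y1][x1][c])
        for c in range(channels)
    ]


def query_triplane(planes, point):
    """Concatenate features sampled from the XY, XZ and YZ planes (assumed fusion)."""
    x, y, z = point
    feats = []
    for name, (u, v) in (("xy", (x, y)), ("xz", (x, z)), ("yz", (y, z))):
        feats.extend(bilinear_sample(planes[name], u, v))
    return feats


def decode_gaussian(point, feats):
    """Toy stand-in for a learned decoder: map features to Gaussian attributes."""
    opacity = 1.0 / (1.0 + math.exp(-feats[0]))  # squash to (0, 1)
    scale = math.exp(feats[1])                   # keep scale positive
    return {"position": tuple(point), "opacity": opacity, "scale": scale}


# Tiny 2x2 planes with one channel each, purely illustrative data.
plane = [[[0.0], [1.0]], [[2.0], [3.0]]]
planes = {"xy": plane, "xz": plane, "yz": plane}
features = query_triplane(planes, (0.0, 0.0, 0.0))
gaussian = decode_gaussian((0.0, 0.0, 0.0), features)
```

In this kind of design the explicit points carry the geometry while the planes carry the appearance features, which is one plausible reading of why a hybrid could combine fast splatting-based rendering with a compact, regular feature encoding.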

Code & Models

Repositories

vast-ai-research/triplanegaussian
jax

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion