Splatter Image: Ultra-Fast Single-View 3D Reconstruction
Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi

TL;DR
Splatter Image introduces a fast, monocular 3D reconstruction method using Gaussian Splatting, enabling real-time performance and high-quality results from single or multiple images with a simple neural network design.
Contribution
The paper presents a novel, straightforward neural network approach that maps 2D image pixels to 3D Gaussians for ultra-fast monocular 3D reconstruction using Gaussian Splatting.
Findings
Achieves 38 FPS for monocular reconstruction.
Outperforms prior methods on multiple benchmarks.
Reconstructs high-quality 3D scenes efficiently.
Abstract
We introduce the Splatter Image, an ultra-efficient approach for monocular 3D object reconstruction. Splatter Image is based on Gaussian Splatting, which allows fast and high-quality reconstruction of 3D scenes from multiple images. We apply Gaussian Splatting to monocular reconstruction by learning a neural network that, at test time, performs reconstruction in a feed-forward manner, at 38 FPS. Our main innovation is the surprisingly straightforward design of this network, which, using 2D operators, maps the input image to one 3D Gaussian per pixel. The resulting set of Gaussians thus has the form of an image, the Splatter Image. We further extend the method to take several images as input via cross-view attention. Owing to the speed of the renderer (588 FPS), we use a single GPU for training while generating entire images at each iteration to optimize perceptual metrics like LPIPS. On several…
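The idea of "one 3D Gaussian per pixel" can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The 15-channel layout, the depth-plus-offset placement along camera rays, the activation functions, and the names `pixel_rays` and `splatter_image_to_gaussians` are all assumptions made for illustration. The sketch only shows how an image-shaped network output could be unpacked into a flat set of Gaussians ready for a splatting renderer.

```python
import numpy as np


def pixel_rays(height, width, focal):
    """Unit ray direction per pixel for a hypothetical pinhole camera.

    Assumes the principal point is at the image center and the camera
    looks down +z. Returns an array of shape (height, width, 3).
    """
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    dirs = np.stack(
        [
            (xs - (width - 1) / 2.0) / focal,
            (ys - (height - 1) / 2.0) / focal,
            np.ones((height, width)),
        ],
        axis=-1,
    )
    return dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def splatter_image_to_gaussians(pred, rays):
    """Unpack an (H, W, 15) image-shaped prediction into H*W Gaussians.

    Assumed (hypothetical) channel layout:
      0      depth along the pixel's ray
      1:4    3D offset from the point on the ray
      4      opacity
      5:8    per-axis scale
      8:12   rotation quaternion
      12:15  RGB color
    """
    height, width, channels = pred.shape
    assert channels == 15 and rays.shape == (height, width, 3)

    flat = pred.reshape(-1, channels)   # one row per pixel
    dirs = rays.reshape(-1, 3)

    depth = np.exp(flat[:, 0:1])        # keep depth positive
    offset = flat[:, 1:4]
    quat = flat[:, 8:12]
    quat = quat / (np.linalg.norm(quat, axis=1, keepdims=True) + 1e-8)

    return {
        "mean": depth * dirs + offset,      # (H*W, 3) Gaussian centers
        "opacity": sigmoid(flat[:, 4:5]),   # (H*W, 1) in (0, 1)
        "scale": np.exp(flat[:, 5:8]),      # (H*W, 3) positive
        "rotation": quat,                   # (H*W, 4) unit quaternions
        "color": sigmoid(flat[:, 12:15]),   # (H*W, 3) in (0, 1)
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for the output of a 2D image-to-image network.
    fake_prediction = rng.normal(size=(8, 8, 15))
    gaussians = splatter_image_to_gaussians(
        fake_prediction, pixel_rays(8, 8, focal=10.0)
    )
    print({name: value.shape for name, value in gaussians.items()})
```

Because the prediction keeps the spatial layout of the input image, an ordinary 2D image-to-image network can produce it, which is consistent with the abstract's emphasis on 2D operators.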
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Image Processing Techniques and Applications
Methods: Sparse Evolutionary Training · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
