Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections

Congrong Xu; Justin Kerr; Angjoo Kanazawa

arXiv:2407.12306 · cs.CV · October 1, 2024 · 2 cites

TL;DR

Splatfacto-W combines Gaussian Splatting with per-image appearance embeddings and background modeling to enable real-time, high-quality view synthesis from unconstrained photo collections despite photometric variations between images.

Contribution

The paper presents Splatfacto-W, which integrates per-Gaussian neural color features and per-image appearance embeddings into Gaussian Splatting to improve scene reconstruction from in-the-wild image collections.

Findings

01. PSNR improved by an average of 5.3 dB over 3DGS

02. Training speed increased by 150 times over NeRF-based methods

03. Achieves real-time rendering with high scene consistency
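
To put the PSNR figure in context, the sketch below computes PSNR for images scaled to [0, 1] and converts a dB gain into the corresponding reduction in mean squared error. The function names `psnr` and `mse_ratio_for_gain` are illustrative helpers, not part of Splatfacto-W or Nerfstudio. Under this standard definition, a 5.3 dB gain corresponds to a mean squared error that is roughly 3.4 times smaller.

```python
import numpy as np


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    reference = np.asarray(reference, dtype=np.float64)
    rendered = np.asarray(rendered, dtype=np.float64)
    mse = np.mean((reference - rendered) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    target = np.zeros((4, 4))
    estimate = np.full((4, 4), 0.1)
    print("PSNR (dB):", psnr(target, estimate))
    print("MSE reduction for +5.3 dB:", mse_ratio_for_gain(5.3))
```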

Abstract

Novel view synthesis from unconstrained in-the-wild image collections remains a significant yet challenging task due to photometric variations and transient occluders that complicate accurate scene reconstruction. Previous methods have approached these issues by integrating per-image appearance feature embeddings into Neural Radiance Fields (NeRFs). Although 3D Gaussian Splatting (3DGS) offers faster training and real-time rendering, adapting it to unconstrained image collections is non-trivial because its architecture differs substantially. In this paper, we introduce Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into the rasterization process, along with a spherical harmonics-based background model to represent varying photometric appearances and better depict backgrounds. Our key contributions include latent…
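
The idea of combining per-Gaussian color features with per-image appearance embeddings, plus a spherical-harmonics background, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the sizes, the two-layer MLP, the random weights, the degree-1 harmonics and their sign convention, and the names `gaussian_colors`, `background_color`, and `composite` are hypothetical and do not reproduce the paper's architecture or the Nerfstudio code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for this sketch.
NUM_GAUSSIANS, NUM_IMAGES = 1000, 8
FEAT_DIM, EMBED_DIM, HIDDEN = 16, 32, 64

# Learnable quantities in a real system; random placeholders here.
gaussian_features = rng.normal(size=(NUM_GAUSSIANS, FEAT_DIM))
image_embeddings = rng.normal(size=(NUM_IMAGES, EMBED_DIM))
w1 = rng.normal(scale=0.1, size=(FEAT_DIM + EMBED_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
w2 = rng.normal(scale=0.1, size=(HIDDEN, 3))
b2 = np.zeros(3)


def gaussian_colors(image_index):
    """Predict one RGB color per Gaussian, conditioned on one image's embedding."""
    embed = np.broadcast_to(image_embeddings[image_index],
                            (NUM_GAUSSIANS, EMBED_DIM))
    x = np.concatenate([gaussian_features, embed], axis=1)
    hidden = np.maximum(x @ w1 + b1, 0.0)      # ReLU
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))       # sigmoid keeps RGB in (0, 1)


# Real spherical-harmonics constants for degrees 0 and 1.
SH_C0 = 0.28209479177387814
SH_C1 = 0.4886025119029199


def background_color(directions, sh_coeffs):
    """Evaluate a degree-1 SH background.

    directions: (N, 3) unit view directions; sh_coeffs: (4, 3) RGB coefficients.
    Basis order and signs are one possible convention, assumed for this sketch.
    """
    x, y, z = directions[:, 0:1], directions[:, 1:2], directions[:, 2:3]
    basis = np.concatenate(
        [np.full_like(x, SH_C0), SH_C1 * y, SH_C1 * z, SH_C1 * x], axis=1)
    return np.clip(basis @ sh_coeffs, 0.0, 1.0)


def composite(foreground_rgb, accumulated_alpha, background_rgb):
    """Blend the background into pixels the Gaussians did not fully cover."""
    return foreground_rgb + (1.0 - accumulated_alpha) * background_rgb
```

Calling `gaussian_colors(0)` and `gaussian_colors(1)` yields different colors for the same Gaussians, which is how a shared geometry can reproduce differing per-photo appearance in this simplified picture.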

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings