WildGaussians: 3D Gaussian Splatting in the Wild
Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, Torsten Sattler

TL;DR
WildGaussians enhances 3D Gaussian Splatting to effectively handle occlusions, dynamic objects, and illumination changes in in-the-wild scenes, achieving state-of-the-art results with real-time rendering.
Contribution
We introduce WildGaussians, a novel method that integrates appearance modeling with 3DGS using robust features to improve in-the-wild scene reconstruction.
Findings
Outperforms 3DGS and NeRF on in-the-wild data
Maintains real-time rendering speeds
Effectively handles occlusions and appearance changes
Abstract
While the field of 3D scene reconstruction is dominated by NeRFs due to their photorealistic quality, 3D Gaussian Splatting (3DGS) has recently emerged, offering similar quality with real-time rendering speeds. However, both methods primarily excel with well-controlled 3D scenes, while in-the-wild data - characterized by occlusions, dynamic objects, and varying illumination - remains challenging. NeRFs can adapt to such conditions easily through per-image embedding vectors, but 3DGS struggles due to its explicit representation and lack of shared parameters. To address this, we introduce WildGaussians, a novel approach to handle occlusions and appearance changes with 3DGS. By leveraging robust DINO features and integrating an appearance modeling module within 3DGS, our method achieves state-of-the-art results. We demonstrate that WildGaussians matches the real-time rendering speed of…
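The per-image embedding idea mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify the appearance module, so the per-Gaussian embedding, the affine (scale and offset) color adjustment, the single linear layer, and every name below (`make_linear`, `adjust_color`, `IMAGE_DIM`, `GAUSSIAN_DIM`) are assumptions made for illustration, and random weights stand in for learned ones.

```python
import random


def make_linear(n_in, n_out, rng):
    """Toy linear layer: small random weights (n_out x n_in) and a zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply the toy linear layer to a feature vector x."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def adjust_color(base_rgb, image_embedding, gaussian_embedding, layer):
    """Predict a per-channel scale and offset and apply them to a base color.

    Assumption: appearance is modeled as an affine color change conditioned on
    a per-image and a per-Gaussian embedding (hypothetical design).
    """
    out = linear(layer, image_embedding + gaussian_embedding)
    scale, offset = out[:3], out[3:]
    # Using (1 + scale) means a zero prediction leaves the color unchanged.
    return [
        min(1.0, max(0.0, (1.0 + s) * c + o))
        for c, s, o in zip(base_rgb, scale, offset)
    ]


rng = random.Random(0)
IMAGE_DIM, GAUSSIAN_DIM = 4, 3  # hypothetical embedding sizes
layer = make_linear(IMAGE_DIM + GAUSSIAN_DIM, 6, rng)

# One embedding per training image; in practice these would be optimized.
image_embeddings = {
    name: [rng.gauss(0.0, 1.0) for _ in range(IMAGE_DIM)]
    for name in ("noon", "dusk")
}
gaussian_embedding = [rng.gauss(0.0, 1.0) for _ in range(GAUSSIAN_DIM)]
base_rgb = [0.6, 0.5, 0.4]

# The same Gaussian takes a different color under each image's embedding.
colors = {
    name: adjust_color(base_rgb, emb, gaussian_embedding, layer)
    for name, emb in image_embeddings.items()
}
for name, rgb in colors.items():
    print(name, [round(v, 3) for v in rgb])
```

The point of the sketch is that the geometry and base color are shared across all images, while a small per-image vector accounts for illumination differences between captures.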
Taxonomy
Topics: Interactive and Immersive Displays
Methods: Softmax · Residual Connection · Attention Is All You Need · Layer Normalization · Linear Layer · Dense Connections · Multi-Head Attention · Vision Transformer · self-DIstillation with NO labels · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
