SplatSDF: Boosting SDF-NeRF via Architecture-Level Fusion with Gaussian Splats
Runfa Blark Li, Keito Suzuki, Bang Du, Ki Myung Brian Lee, Nikolay Atanasov, Truong Nguyen

TL;DR
SplatSDF introduces an architecture-level fusion of Gaussian splats into SDF-NeRF, significantly accelerating training convergence and improving geometric accuracy for practical robotic applications.
Contribution
The paper presents a novel fusion strategy that integrates 3D Gaussian splats directly into the SDF-NeRF architecture, enabling faster training convergence and better geometric accuracy.
Findings
Achieves 3X faster convergence than baseline methods.
Outperforms state-of-the-art SDF-NeRF in geometric accuracy.
Accelerates gradient and Hessian computations by 3X.
Abstract
A signed distance-radiance field (SDF-NeRF) is a promising environment representation that offers both photo-realistic rendering and geometric reasoning such as proximity queries for collision avoidance. However, the slow training speed and convergence of SDF-NeRF hinder its use in practical robotic systems. We propose SplatSDF, a novel SDF-NeRF architecture that accelerates convergence using 3D Gaussian splats (3DGS), which can be quickly pre-trained. Unlike prior approaches that introduce a consistency loss between separate 3DGS and SDF-NeRF models, SplatSDF directly fuses 3DGS at an architectural level by consuming it as an input to SDF-NeRF during training. This is achieved using a novel sparse 3DGS fusion strategy that injects neural embeddings of 3DGS into SDF-NeRF around the object surface, while also permitting inference without 3DGS for minimal operation. Experimental results…
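The sparse fusion idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `fuse_sparse`, the `radius` threshold, the Gaussian distance weighting, and the `use_3dgs` flag are all hypothetical, and proximity to splat centers is used here only as a stand-in for "around the object surface". The sketch shows per-point features receiving an added embedding only where a splat lies nearby, and being passed through unchanged otherwise or when fusion is switched off.

```python
import math


def fuse_sparse(points, point_feats, gs_centers, gs_embeds,
                radius=0.1, use_3dgs=True):
    """Toy sparse fusion (illustrative assumption, not the paper's code).

    For each sample point, add the distance-weighted mean of the embeddings
    of all splat centers within `radius`. Points with no splat nearby keep
    their feature unchanged, which is what makes the fusion sparse.
    """
    if not use_3dgs:
        # Hypothetical "inference without 3DGS" path: skip fusion entirely.
        return [list(f) for f in point_feats]

    fused = []
    for p, f in zip(points, point_feats):
        acc = [0.0] * len(f)
        weight_sum = 0.0
        for c, e in zip(gs_centers, gs_embeds):
            d = math.dist(p, c)
            if d <= radius:
                w = math.exp(-(d / radius) ** 2)  # assumed weighting
                weight_sum += w
                for i in range(len(acc)):
                    acc[i] += w * e[i]
        if weight_sum > 0.0:
            fused.append([fi + ai / weight_sum for fi, ai in zip(f, acc)])
        else:
            fused.append(list(f))
    return fused


# Two sample points; only the first lies near the single splat center.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
feats = [[1.0, 1.0], [2.0, 2.0]]
centers = [(0.0, 0.0, 0.05)]
embeds = [[0.5, -0.5]]

fused = fuse_sparse(points, feats, centers, embeds)
plain = fuse_sparse(points, feats, centers, embeds, use_3dgs=False)
```

In this example only the first point is modified, since the second is farther than `radius` from every splat center. In a real model the embeddings would be learned and the fused features fed to the SDF network; those details are not given in the text above and are left out here.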
Taxonomy
Topics: Neural Networks and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
