Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting
Matthew Strong, Boshu Lei, Aiden Swann, Wen Jiang, Kostas Daniilidis, Monroe Kennedy III

TL;DR
This paper introduces an active view and touch selection framework for robotic scene understanding using 3D Gaussian Splatting, improving performance in limited-view scenarios through novel training and selection methods.
Contribution
It presents an end-to-end online training pipeline with a new semantic depth alignment method and extends FisherRF to active view and touch selection in 3DGS.
Findings
Enhanced 3D scene reconstruction in few-view settings
Improved view and touch pose selection accuracy
Demonstrated real-time performance on a robotic system
Abstract
We propose a framework for active next best view and touch selection for robotic manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit 3D scene representation for robotics, as it can represent scenes in a manner that is both photorealistic and geometrically accurate. However, in real-world, online robotic scenes where the number of views is limited by efficiency requirements, random view selection for 3DGS becomes impractical because views are often overlapping and redundant. We address this issue by proposing an end-to-end online training and active view selection pipeline, which enhances the performance of 3DGS in few-view robotics settings. We first elevate the performance of few-shot 3DGS with a novel semantic depth alignment method using Segment Anything Model 2 (SAM2) that we supplement with Pearson depth and surface normal loss to improve…
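The abstract mentions a Pearson depth loss without giving its form. As a rough illustration only, the sketch below shows one common formulation used in few-view 3DGS work: one minus the Pearson correlation between rendered depth values and reference (for example, monocular) depth values. The function name `pearson_depth_loss`, its parameters, and the flat-list input format are assumptions made for this example, not the authors' implementation.

```python
import math


def pearson_depth_loss(rendered_depth, reference_depth, eps=1e-12):
    """Return 1 - Pearson correlation between two flat lists of depth values.

    Hypothetical helper for illustration; the paper's exact loss is not
    specified in this summary.
    """
    if len(rendered_depth) != len(reference_depth) or not rendered_depth:
        raise ValueError("depth lists must be non-empty and of equal length")

    n = len(rendered_depth)
    mean_r = sum(rendered_depth) / n
    mean_m = sum(reference_depth) / n

    # Center both signals so the result ignores any constant depth offset.
    dr = [d - mean_r for d in rendered_depth]
    dm = [d - mean_m for d in reference_depth]

    # Normalizing by the product of norms makes the result ignore positive scale.
    cov = sum(a * b for a, b in zip(dr, dm))
    norm = math.sqrt(sum(a * a for a in dr) * sum(b * b for b in dm))

    return 1.0 - cov / (norm + eps)


# A rendered depth that matches the reference up to positive scale and shift
# gives a loss close to 0; a reversed ordering gives a loss close to 2.
aligned = pearson_depth_loss([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
flipped = pearson_depth_loss([1.0, 2.0, 3.0, 4.0], [40.0, 30.0, 20.0, 10.0])
```

Because a correlation-based loss is invariant to positive scale and shift, it can compare rendered depth against relative depth estimates that carry no metric scale.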
Taxonomy
Topics: Interactive and Immersive Displays · Augmented Reality Applications · Robotics and Sensor-Based Localization
