IDSplat: Instance-Decomposed 3D Gaussian Splatting for Driving Scenes
Carl Lindström, Mahan Rafidashti, Maryam Fatemi, Lars Hammarstrand, Martin R. Oswald, Lennart Svensson

TL;DR
IDSplat is a self-supervised 3D Gaussian Splatting method that explicitly decomposes dynamic driving scenes into instances with learnable trajectories, enabling realistic, annotation-free scene reconstruction for autonomous driving.
Contribution
IDSplat introduces an instance decomposition approach that uses language-grounded tracking and a coordinated-turn smoothing scheme, without relying on human annotations.
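For readers unfamiliar with the term, the coordinated-turn model named above is a standard motion model in which an object moves at constant speed and constant turn rate. The following is a minimal sketch of one prediction step of that textbook model, not the paper's smoothing scheme, whose details are not given in this summary. The state layout and the function names `coordinated_turn_step` and `rollout` are assumptions made for illustration.

```python
import math


def coordinated_turn_step(state, dt):
    """Propagate a planar coordinated-turn state forward by dt seconds.

    state = (x, y, speed, heading, turn_rate); this layout is an
    illustrative assumption, not taken from the paper.
    """
    x, y, v, heading, omega = state
    if abs(omega) < 1e-9:
        # Zero turn rate: the model reduces to straight-line motion.
        x_new = x + v * dt * math.cos(heading)
        y_new = y + v * dt * math.sin(heading)
    else:
        # Constant speed along a circular arc of radius v / omega.
        x_new = x + (v / omega) * (math.sin(heading + omega * dt) - math.sin(heading))
        y_new = y + (v / omega) * (math.cos(heading) - math.cos(heading + omega * dt))
    return (x_new, y_new, v, heading + omega * dt, omega)


def rollout(state, dt, steps):
    """Apply the prediction step repeatedly and return all visited states."""
    states = [state]
    for _ in range(steps):
        states.append(coordinated_turn_step(states[-1], dt))
    return states
```

A smoother built on such a model would typically combine these predictions with noisy per-frame position estimates, so that implausible jumps in a tracked object's trajectory are suppressed.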
Findings
Achieves competitive reconstruction quality on the Waymo dataset
Maintains instance-level decomposition across diverse sequences
Generalizes to different view densities without retraining
Abstract
Reconstructing dynamic driving scenes is essential for developing autonomous systems through sensor-realistic simulation. Although recent methods achieve high-fidelity reconstructions, they either rely on costly human annotations for object trajectories or use time-varying representations without explicit object-level decomposition, leading to intertwined static and dynamic elements that hinder scene separation. We present IDSplat, a self-supervised 3D Gaussian Splatting framework that reconstructs dynamic scenes with explicit instance decomposition and learnable motion trajectories, without requiring human annotations. Our key insight is to model dynamic objects as coherent instances undergoing rigid transformations, rather than unstructured time-varying primitives. For instance decomposition, we employ zero-shot, language-grounded video tracking anchored to 3D using lidar, and…
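The idea of treating a dynamic object as one coherent instance under a rigid transformation, rather than as independently time-varying primitives, can be illustrated with a small sketch. It assumes, purely for illustration, that each instance stores its points once in an object-local frame and that a per-frame pose consists of a yaw angle and a translation. A full rigid transform would use a general 3D rotation. The function name `transform_instance` and the pose layout are hypothetical and not taken from the paper.

```python
import math


def transform_instance(points, yaw, translation):
    """Map object-local 3D points to world coordinates with one rigid pose.

    points:      list of (x, y, z) tuples in the object's local frame
    yaw:         rotation about the vertical axis in radians (simplifying assumption)
    translation: (tx, ty, tz) position of the object in the world frame
    """
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [
        (c * px - s * py + tx, s * px + c * py + ty, pz + tz)
        for px, py, pz in points
    ]


# Hypothetical per-frame trajectory: every point of the instance moves together,
# so only the pose changes over time while the local geometry stays fixed.
local_points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
trajectory = [(0.0, (0.0, 0.0, 0.0)), (math.pi / 2, (2.0, 0.0, 1.0))]
world_per_frame = [transform_instance(local_points, yaw, t) for yaw, t in trajectory]
```

Because the geometry is shared across frames, the motion of an instance is described by its sequence of poses alone, which is what makes the trajectory a compact, learnable quantity.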
Taxonomy
Topics: Robot Manipulation and Learning · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging
