Inst4DGS: Instance-Decomposed 4D Gaussian Splatting with Multi-Video Label Permutation Learning
Yonghan Lee, Dinesh Manocha

TL;DR
Inst4DGS is a 4D Gaussian Splatting method that aligns inconsistent instance labels across independently segmented multi-view videos, enabling high-quality rendering and instance segmentation with temporally stable identities.
Contribution
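The paper proposes two components: (1) per-video label-permutation latents, passed through a differentiable Sinkhorn layer, that learn cross-video instance matches; and (2) instance-decomposed motion scaffolds that provide low-dimensional motion bases per object for efficient long-horizon trajectory optimization.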
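To make the label-matching idea concrete, the following is a minimal sketch of generic Sinkhorn normalization in plain Python. It is not the authors' implementation: the function names `sinkhorn` and `_logsumexp`, the parameters `tau` and `n_iters`, and the example score matrix are all hypothetical, chosen only for illustration. The sketch assumes one square matrix of learnable scores per video, where entry (i, j) scores the match between that video's local label i and global instance j. Alternating row and column normalization turns the scores into an approximately doubly stochastic "soft permutation" that stays differentiable with respect to the scores.

```python
import math


def _logsumexp(xs):
    # Numerically stable log(sum(exp(x))) over a list of floats.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def sinkhorn(logits, n_iters=50, tau=0.1):
    """Map a square score matrix to an approximately doubly stochastic one.

    logits  : n x n list of lists (hypothetical per-video permutation latent)
    n_iters : number of alternating row/column normalization passes
    tau     : temperature; smaller values push the result toward a hard permutation
    """
    n = len(logits)
    # Work in the log domain for numerical stability.
    log_p = [[v / tau for v in row] for row in logits]
    for _ in range(n_iters):
        # Row normalization: each row sums to 1 after exponentiation.
        for i in range(n):
            lse = _logsumexp(log_p[i])
            log_p[i] = [v - lse for v in log_p[i]]
        # Column normalization: each column sums to 1 after exponentiation.
        for j in range(n):
            lse = _logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= lse
    return [[math.exp(v) for v in row] for row in log_p]


# Hypothetical scores for one video with three local labels.
logits = [
    [0.1, 0.2, 2.0],
    [1.5, 0.0, 0.3],
    [0.2, 1.8, 0.1],
]
soft_perm = sinkhorn(logits)

# Read off a hard assignment: local label i -> global instance hard_match[i].
hard_match = [max(range(3), key=lambda j: row[j]) for row in soft_perm]
print(hard_match)  # [2, 0, 1]
```

In a training setting, the soft matrix would typically be used to remap a video's per-pixel label probabilities into a shared label space before computing a segmentation loss, so that gradients flow back into the scores. That usage is an assumption about how such a layer is commonly applied, not a detail confirmed by this summary.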
Findings
Reports state-of-the-art rendering quality, with a PSNR of 28.36.
Raises instance segmentation mIoU to 0.9129.
Supports joint tracking and instance decomposition.
Abstract
We present Inst4DGS, an instance-decomposed 4D Gaussian Splatting (4DGS) approach with long-horizon per-Gaussian trajectories. While dynamic 4DGS has advanced rapidly, instance-decomposed 4DGS remains underexplored, largely due to the difficulty of associating inconsistent instance labels across independently segmented multi-view videos. We address this challenge by introducing per-video label-permutation latents that learn cross-video instance matches through a differentiable Sinkhorn layer, enabling direct multi-view supervision with consistent identity preservation. This explicit label alignment yields sharp decision boundaries and temporally stable identities without identity drift. To further improve efficiency, we propose instance-decomposed motion scaffolds that provide low-dimensional motion bases per object for long-horizon trajectory optimization. Experiments on Panoptic…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Human Pose and Action Recognition
