InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception
Haijie Li, Yanmin Wu, Jiarui Meng, Qiankun Gao, Zhiyao Zhang, Ronggang Wang, Jian Zhang

TL;DR
InstanceGaussian introduces a joint appearance-semantic Gaussian representation for 3D scene understanding, effectively balancing and integrating features to improve instance segmentation accuracy in complex 3D environments.
Contribution
It proposes a novel joint appearance-semantic Gaussian representation, a progressive training strategy, and a bottom-up instance aggregation method for enhanced 3D scene segmentation.
Findings
Achieves state-of-the-art results in open-vocabulary 3D point-level segmentation.
Effectively balances appearance and semantic features for better boundary delineation.
Addresses segmentation challenges with a category-agnostic, bottom-up approach.
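As a rough illustration of what a category-agnostic, bottom-up grouping can look like, the sketch below merges points into instances whenever they are both spatially close and similar in feature space, with no class labels involved. It is not the paper's aggregation method, whose details are not given in this summary. The function names, the `radius` and `min_sim` thresholds, and the union-find merging rule are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def aggregate_instances(positions, features, radius=0.5, min_sim=0.9):
    """Hypothetical bottom-up grouping: returns one instance label per point.

    Two points are linked when they lie within `radius` of each other AND
    their features have cosine similarity >= `min_sim`. Linked points are
    merged transitively with a union-find structure. Both thresholds are
    illustrative assumptions, not values from the paper.
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Path halving keeps the trees shallow.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(n^2) pair check; fine for a small demonstration.
    for i in range(n):
        for j in range(i + 1, n):
            close = math.dist(positions[i], positions[j]) <= radius
            similar = cosine(features[i], features[j]) >= min_sim
            if close and similar:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Relabel roots as consecutive instance ids in order of first appearance.
    root_to_label = {}
    labels = []
    for i in range(n):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
    feats = [(1.0, 0.0), (1.0, 0.05), (0.0, 1.0)]
    print(aggregate_instances(pts, feats))
```

Because grouping starts from individual points and never consults a category list, the number of instances falls out of the data rather than being fixed in advance. This is the general property that bottom-up approaches rely on to avoid the over- and under-segmentation described in the abstract.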
Abstract
3D scene understanding has become an essential area of research with applications in autonomous driving, robotics, and augmented reality. Recently, 3D Gaussian Splatting (3DGS) has emerged as a powerful approach, combining explicit modeling with neural adaptability to provide efficient and detailed scene representations. However, three major challenges remain in leveraging 3DGS for scene understanding: 1) an imbalance between appearance and semantics, where dense Gaussian usage for fine-grained texture modeling does not align with the minimal requirements for semantic attributes; 2) inconsistencies between appearance and semantics, as purely appearance-based Gaussians often misrepresent object boundaries; and 3) reliance on top-down instance segmentation methods, which struggle with uneven category distributions, leading to over- or under-segmentation. In this work, we propose…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Face recognition and analysis · Human Pose and Action Recognition
Methods: ALIGN
