InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Haijie Li; Yanmin Wu; Jiarui Meng; Qiankun Gao; Zhiyao Zhang; Ronggang Wang; Jian Zhang

arXiv:2411.19235 · cs.CV · April 16, 2025

TL;DR

InstanceGaussian introduces a joint appearance-semantic Gaussian representation for 3D scene understanding, effectively balancing and integrating features to improve instance segmentation accuracy in complex 3D environments.

Contribution

The paper proposes a joint appearance-semantic Gaussian representation, a progressive training strategy, and a bottom-up instance aggregation method for 3D scene segmentation.

Findings

1. Achieves state-of-the-art results in open-vocabulary 3D point-level segmentation.
2. Effectively balances appearance and semantic features for better boundary delineation.
3. Addresses segmentation challenges with a category-agnostic, bottom-up approach.

Abstract

3D scene understanding has become an essential area of research with applications in autonomous driving, robotics, and augmented reality. Recently, 3D Gaussian Splatting (3DGS) has emerged as a powerful approach, combining explicit modeling with neural adaptability to provide efficient and detailed scene representations. However, three major challenges remain in leveraging 3DGS for scene understanding: 1) an imbalance between appearance and semantics, where dense Gaussian usage for fine-grained texture modeling does not align with the minimal requirements for semantic attributes; 2) inconsistencies between appearance and semantics, as purely appearance-based Gaussians often misrepresent object boundaries; and 3) reliance on top-down instance segmentation methods, which struggle with uneven category distributions, leading to over- or under-segmentation. In this work, we propose…

Taxonomy

Topics: 3D Shape Modeling and Analysis · Face recognition and analysis · Human Pose and Action Recognition

Methods: ALIGN