LaSSM: Efficient Semantic-Spatial Query Decoding via Local Aggregation and State Space Models for 3D Instance Segmentation
Lei Yao, Yi Wang, Yawen Cui, Moyun Liu, and Lap-Pui Chau

TL;DR
LaSSM is a simple, efficient, and competitive approach to 3D scene instance segmentation from point clouds. It uses hierarchical semantic-spatial query initialization and a local-aggregation state space decoder to improve accuracy and reduce computation.
Contribution
The paper proposes LaSSM, a novel method combining hierarchical semantic-spatial query initialization with a coordinate-guided state space decoder for efficient 3D instance segmentation.
Findings
Ranks first on the ScanNet++ V2 leaderboard, with mAP 2.5% higher than prior methods
Uses only one-third of the FLOPs of previous methods
Performs competitively on multiple 3D scene segmentation benchmarks
Abstract
Query-based 3D scene instance segmentation from point clouds has attained notable performance. However, existing methods suffer from the query initialization dilemma due to the sparse nature of point clouds and rely on computationally intensive attention mechanisms in query decoders. We accordingly introduce LaSSM, prioritizing simplicity and efficiency while maintaining competitive performance. Specifically, we propose a hierarchical semantic-spatial query initializer to derive the query set from superpoints by considering both semantic cues and spatial distribution, achieving comprehensive scene coverage and accelerated convergence. We further present a coordinate-guided state space model (SSM) decoder that progressively refines queries. The novel decoder features a local aggregation scheme that restricts the model to focus on geometrically coherent regions and a spatial dual-path SSM…
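The idea of deriving queries from superpoints using both semantic cues and spatial distribution can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's actual procedure: the function name `init_queries`, the two-stage design (a top-k semantic filter followed by farthest point sampling), and the parameters `num_candidates` and `num_queries` are all hypothetical.

```python
import math

def init_queries(coords, sem_scores, num_candidates, num_queries):
    """Select query superpoints: semantic filter first, then spatial spread.

    coords: list of (x, y, z) superpoint centers.
    sem_scores: per-superpoint foreground confidence (hypothetical input).
    Both stages are illustrative assumptions, not the authors' method.
    """
    # Stage 1 (semantic): keep the highest-scoring superpoints as candidates.
    order = sorted(range(len(coords)), key=lambda i: -sem_scores[i])
    cand = order[:num_candidates]
    if not cand or num_queries <= 0:
        return []

    # Stage 2 (spatial): farthest point sampling over the candidates,
    # seeded with the most confident one, so queries spread across the scene.
    chosen = [cand[0]]
    dist = {i: math.dist(coords[i], coords[cand[0]]) for i in cand}
    while len(chosen) < min(num_queries, len(cand)):
        remaining = [i for i in cand if i not in chosen]
        nxt = max(remaining, key=lambda i: dist[i])
        chosen.append(nxt)
        # Track each candidate's distance to its nearest chosen superpoint.
        for i in cand:
            dist[i] = min(dist[i], math.dist(coords[i], coords[nxt]))
    return chosen

coords = [(0, 0, 0), (0.1, 0, 0), (10, 0, 0), (0, 10, 0), (5, 5, 0)]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
print(init_queries(coords, scores, num_candidates=4, num_queries=3))  # [0, 2, 3]
```

In this toy scene the low-confidence superpoint 4 is dropped by the semantic stage, and superpoint 1 is skipped by the spatial stage because it sits next to the already selected superpoint 0.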
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
