PSGformer: Enhancing 3D Point Cloud Instance Segmentation via Precise Semantic Guidance
Lei Pan, Wuyang Luan, Yuan Zheng, Qiang Fu, Junhui Li

TL;DR
PSGformer is a novel 3D instance segmentation network that leverages multi-level semantic aggregation and parallel feature fusion transformers to improve accuracy by effectively capturing and integrating global and local scene features.
Contribution
The paper introduces PSGformer, which employs innovative semantic aggregation and transformer-based feature fusion to enhance 3D instance segmentation performance.
Findings
Outperforms state-of-the-art by 2.2% mAP on ScanNetv2
Effectively captures detailed semantic information from multiple scales
Achieves superior feature representation through transformer modules
Abstract
Most existing 3D instance segmentation methods are derived from 3D semantic segmentation models. However, these indirect approaches suffer from certain limitations. They fail to fully leverage global and local semantic information for accurate prediction, which hampers the overall performance of the 3D instance segmentation framework. To address these issues, this paper presents PSGformer, a novel 3D instance segmentation network. PSGformer incorporates two key advancements to enhance the performance of 3D instance segmentation. Firstly, we propose a Multi-Level Semantic Aggregation Module, which effectively captures scene features by employing foreground point filtering and multi-radius aggregation. This module enables the acquisition of more detailed semantic information from global and local perspectives. Secondly, PSGformer introduces a Parallel Feature Fusion Transformer Module…
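The abstract names foreground point filtering and multi-radius aggregation but gives no implementation details, so the following is only a minimal NumPy sketch of the general idea and not the authors' module. The function name `multi_radius_aggregate`, the score threshold, the radii, and the use of mean pooling over a ball neighborhood are all assumptions made for illustration.

```python
import numpy as np


def multi_radius_aggregate(points, feats, fg_scores,
                           radii=(0.1, 0.2, 0.4), fg_threshold=0.5):
    """Hypothetical sketch: keep foreground points, then mean-pool
    neighbor features inside balls of several radii.

    points:    (N, 3) xyz coordinates
    feats:     (N, C) per-point features
    fg_scores: (N,)   assumed per-point foreground probability
    Returns the foreground mask (N,) and features of shape (M, C * len(radii)).
    """
    points = np.asarray(points, dtype=float)
    feats = np.asarray(feats, dtype=float)
    fg_scores = np.asarray(fg_scores, dtype=float)

    # Foreground point filtering: drop points scored as background.
    fg_mask = fg_scores > fg_threshold
    fg_points = points[fg_mask]
    fg_feats = feats[fg_mask]

    # Dense pairwise distances (M, M); fine for a sketch, too costly for real scenes.
    diff = fg_points[:, None, :] - fg_points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    pooled = []
    for r in radii:
        # Each point is its own neighbor (distance 0), so counts are >= 1.
        neighbor = (dist <= r).astype(float)
        counts = neighbor.sum(axis=1, keepdims=True)
        pooled.append(neighbor @ fg_feats / counts)

    # Concatenate the per-radius summaries: small radii are local, large radii are more global.
    return fg_mask, np.concatenate(pooled, axis=1)
```

Small radii summarize a point's immediate surroundings while large radii approach a scene-level average, which is one simple way to combine local and global context in a single feature vector.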
Taxonomy
Topics: 3D Shape Modeling and Analysis · Remote Sensing and LiDAR Applications · 3D Surveying and Cultural Heritage
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Absolute Position Encodings · Adam
