PlaneSAM: Multimodal Plane Instance Segmentation Using the Segment Anything Model
Zhongchen Deng, Zhechen Yang, Chi Chen, Cheng Zeng, Yan Meng, Bisheng Yang

TL;DR
PlaneSAM is a multimodal plane instance segmentation network that effectively combines RGB and depth data, leveraging a dual-complexity backbone and self-supervised pretraining to achieve state-of-the-art results on RGB-D datasets.
Contribution
The paper introduces PlaneSAM, a novel multimodal segmentation method that integrates RGB and depth information using a dual-complexity backbone and self-supervised pretraining, improving performance and generalization.
Findings
Sets a new state of the art (SOTA) on the ScanNet dataset
Outperforms previous methods in zero-shot transfer
Incurs only a 10% increase in computational overhead relative to EfficientSAM
Abstract
Plane instance segmentation from RGB-D data is a crucial research topic for many downstream tasks. However, most existing deep-learning-based methods utilize only information within the RGB bands, neglecting the important role of the depth band in plane instance segmentation. Based on EfficientSAM, a fast version of SAM, we propose a plane instance segmentation network called PlaneSAM, which can fully integrate the information of the RGB bands (spectral bands) and the D band (geometric band), thereby improving the effectiveness of plane instance segmentation in a multimodal manner. Specifically, we use a dual-complexity backbone, with primarily the simpler branch learning D-band features and primarily the more complex branch learning RGB-band features. Consequently, the backbone can effectively learn D-band feature representations even when D-band training data is limited in scale,…
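The dual-complexity idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the per-pixel MLP branches, the layer sizes, the names (make_branch, run_branch, dual_backbone, EMBED_DIM), and the fusion by element-wise sum are all assumptions chosen only to show a low-capacity branch for the D band working alongside a higher-capacity branch for the RGB bands.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_branch(in_dim, hidden_dims, out_dim):
    """Create weights for a small per-pixel MLP (hypothetical stand-in for a backbone branch)."""
    dims = [in_dim, *hidden_dims, out_dim]
    return [rng.normal(0.0, 0.1, size=(a, b)) for a, b in zip(dims[:-1], dims[1:])]

def run_branch(weights, x):
    """Apply the layers to x of shape (H, W, C), with ReLU between layers."""
    for i, w in enumerate(weights):
        x = x @ w
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def count_params(weights):
    """Total number of weights in a branch."""
    return sum(w.size for w in weights)

EMBED_DIM = 16  # assumed feature width, for illustration only

# Higher-capacity branch for the RGB bands (assumed: three hidden layers).
complex_branch = make_branch(3, [64, 64, 64], EMBED_DIM)
# Low-capacity branch for the D band (assumed: a single linear layer),
# so there are few parameters to fit when depth training data is scarce.
simple_branch = make_branch(1, [], EMBED_DIM)

def dual_backbone(rgbd):
    """Split an (H, W, 4) RGB-D array and fuse both branches by summation (an assumption)."""
    rgb, depth = rgbd[..., :3], rgbd[..., 3:]
    return run_branch(complex_branch, rgb) + run_branch(simple_branch, depth)

rgbd = rng.random((8, 8, 4))      # synthetic RGB-D input
features = dual_backbone(rgbd)    # fused features, shape (8, 8, EMBED_DIM)
```

In this sketch the depth branch has far fewer parameters than the RGB branch, which mirrors the abstract's point that the simpler branch can learn D-band features from limited data.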
Taxonomy
Topics: Quality Function Deployment in Product Design · Big Data and Business Intelligence · Color perception and design
Methods: Region Proposal Network · Softmax · RoIPool · Segment Anything Model · Convolution · Faster R-CNN
