SAI3D: Segment Any Instance in 3D Scenes

Yingda Yin; Yuzheng Liu; Yang Xiao; Daniel Cohen-Or; Jingwei Huang; Baoquan Chen

arXiv:2312.11557·cs.CV·March 26, 2024·1 cites

Open Access

TL;DR

SAI3D introduces a zero-shot 3D instance segmentation method that combines geometric primitives and semantic cues from SAM, outperforming existing methods on multiple datasets without requiring annotated training data.

Contribution

It presents a novel hierarchical region-growing algorithm that integrates geometric and semantic information for robust 3D scene parsing in a zero-shot setting.

Findings

01 Outperforms existing open-vocabulary baselines

02 Surpasses fully-supervised methods in class-agnostic segmentation

03 Demonstrates effectiveness on ScanNet, Matterport3D, and ScanNet++ datasets

Abstract

Advancements in 3D instance segmentation have traditionally been tethered to the availability of annotated datasets, limiting their application to a narrow spectrum of object categories. Recent efforts have sought to harness vision-language models like CLIP for open-set semantic reasoning, yet these methods struggle to distinguish between objects of the same categories and rely on specific prompts that are not universally applicable. In this paper, we introduce SAI3D, a novel zero-shot 3D instance segmentation approach that synergistically leverages geometric priors and semantic cues derived from Segment Anything Model (SAM). Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations that are consistent with the multi-view SAM masks. Moreover, we design a hierarchical region-growing algorithm with a dynamic thresholding…
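The merging step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `grow_regions`, the mean-affinity aggregation rule, and the threshold schedule are all assumptions chosen for illustration. In the paper the pairwise scores would come from agreement with multi-view SAM masks; here they are supplied directly as numbers.

```python
from collections import defaultdict


def grow_regions(num_primitives, affinities, thresholds):
    """Hypothetical sketch: merge primitives into regions over several rounds.

    affinities: dict mapping (i, j) primitive-index pairs to a score in [0, 1]
                (assumed to be precomputed, e.g. from 2D mask agreement).
    thresholds: decreasing merge thresholds, one per round (assumed schedule).
    Returns one region label per primitive.
    """
    parent = list(range(num_primitives))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for tau in thresholds:
        while True:
            # Aggregate primitive-level affinities between current regions.
            totals = defaultdict(float)
            counts = defaultdict(int)
            for (i, j), score in affinities.items():
                ri, rj = find(i), find(j)
                if ri != rj:
                    key = (min(ri, rj), max(ri, rj))
                    totals[key] += score
                    counts[key] += 1
            if not totals:
                break
            # Pick the region pair with the highest mean affinity.
            best = max(totals, key=lambda k: totals[k] / counts[k])
            if totals[best] / counts[best] < tau:
                break  # nothing left to merge at this threshold
            parent[best[1]] = best[0]  # the smaller root becomes the label

    return [find(i) for i in range(num_primitives)]


# Four primitives in a chain: 0-1 strongly linked, 1-2 moderately, 2-3 weakly.
chain = {(0, 1): 0.95, (1, 2): 0.75, (2, 3): 0.2}
print(grow_regions(4, chain, thresholds=[0.9, 0.7]))  # [0, 0, 0, 3]
```

With only the strict threshold `[0.9]`, primitive 2 stays separate. The looser second round lets it join the region once the confident merge has already happened, which is the intuition behind growing regions hierarchically rather than applying one global cutoff.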

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Image Processing and 3D Reconstruction · 3D Shape Modeling and Analysis

Methods: Contrastive Language-Image Pre-training · Segment Anything Model