SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He; Henghui Ding; Xudong Jiang; Bihan Wen

arXiv:2407.13761 · cs.CV · July 19, 2024

TL;DR

SegPoint leverages large language models to perform diverse 3D point cloud segmentation tasks, including implicit instruction understanding, within a unified framework, and introduces a new benchmark for evaluating such capabilities.

Contribution

This work introduces SegPoint, the first unified model capable of handling multiple 3D segmentation tasks using LLM reasoning, and presents Instruct3D, a new benchmark for implicit instruction-based segmentation.

Findings

1. Achieves competitive results on the ScanRefer and ScanNet benchmarks.

2. Outperforms existing methods on the Instruct3D dataset.

3. Demonstrates the ability to understand complex implicit instructions.

Abstract

Despite significant progress in 3D point cloud segmentation, existing methods primarily address specific tasks and depend on explicit instructions to identify targets, lacking the capability to infer and understand implicit user intentions in a unified framework. In this work, we propose a model, called SegPoint, that leverages the reasoning capabilities of a multi-modal Large Language Model (LLM) to produce point-wise segmentation masks across a diverse range of tasks: 1) 3D instruction segmentation, 2) 3D referring segmentation, 3) 3D semantic segmentation, and 4) 3D open-vocabulary semantic segmentation. To advance 3D instruction research, we introduce a new benchmark, Instruct3D, designed to evaluate segmentation performance from complex and implicit instructional texts, featuring 2,565 point cloud-instruction pairs. Our experimental results demonstrate that SegPoint achieves…

Taxonomy

Topics: Topic Modeling · Semantic Web and Ontologies