SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

Dingyuan Zhang; Dingkang Liang; Hongcheng Yang; Zhikang Zou; Xiaoqing Ye; Zhe Liu; Xiang Bai

arXiv:2306.02245·cs.CV·January 30, 2024·2 cites

Open Access · 1 Repo

TL;DR

This paper explores adapting the Segment Anything Model's zero-shot segmentation capabilities to 3D object detection, demonstrating promising results on the Waymo dataset and opening new avenues for foundation models in 3D vision tasks.

Contribution

It introduces a SAM-powered bird's-eye-view (BEV) processing pipeline for 3D object detection, an early attempt at applying a vision foundation model to this task.
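
To make the idea of BEV processing concrete, here is a minimal sketch of one common way to turn a LiDAR point cloud into a 2D bird's-eye-view image that a 2D segmenter could consume. It is an illustration under stated assumptions, not the authors' implementation: the function name `points_to_bev`, the `[x, y, z, intensity]` point layout, the ranges, and the cell size are all hypothetical choices and do not come from the paper.

```python
import numpy as np


def points_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.1):
    """Rasterize LiDAR points into a 2D BEV intensity image.

    Assumption (not from the paper): `points` is an (N, 4) array laid out
    as [x, y, z, intensity], with x forward and y to the left, in meters.
    """
    x, y, intensity = points[:, 0], points[:, 1], points[:, 3]

    # Keep only points that fall inside the chosen BEV window.
    keep = (
        (x >= x_range[0]) & (x < x_range[1])
        & (y >= y_range[0]) & (y < y_range[1])
    )
    x, y, intensity = x[keep], y[keep], intensity[keep]

    # Image size in pixels: rows follow y, columns follow x.
    height = int(round((y_range[1] - y_range[0]) / cell))
    width = int(round((x_range[1] - x_range[0]) / cell))

    # Convert metric coordinates to integer pixel indices.
    rows = ((y - y_range[0]) / cell).astype(np.int64)
    cols = ((x - x_range[0]) / cell).astype(np.int64)
    # Guard against floating-point rounding at the upper boundary.
    rows = np.clip(rows, 0, height - 1)
    cols = np.clip(cols, 0, width - 1)

    # Each pixel stores the maximum intensity of the points that land in it.
    bev = np.zeros((height, width), dtype=np.float32)
    np.maximum.at(bev, (rows, cols), intensity)
    return bev
```

The resulting image could then be passed to a 2D segmentation model such as SAM, and each predicted mask mapped back to metric coordinates to form a box. Those steps are omitted here because the summary above does not describe how the paper performs them.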

Findings

01 Promising detection results on the Waymo Open Dataset

02 An early attempt at applying SAM to 3D object detection

03 Open-source code available for further research

Abstract

With the development of large language models, many remarkable linguistic systems like ChatGPT have thrived and achieved astonishing success on many tasks, showing the incredible power of foundation models. In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks. However, whether SAM can be adapted to 3D vision tasks has yet to be explored, especially 3D object detection. With this inspiration, we explore adapting the zero-shot ability of SAM to 3D object detection in this paper. We propose a SAM-powered BEV processing pipeline to detect objects and get promising results on the large-scale Waymo open dataset. As an early attempt, our method takes a step toward 3D object detection with…

Code & Models

Repositories

dyzhang09/sam3d (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Neural Network Applications

Methods: Segment Anything Model