Fast Segment Anything

Xu Zhao; Wenchao Ding; Yongqi An; Yinglong Du; Tao Yu; Min Li; Ming Tang; Jinqiao Wang

arXiv:2306.12156·cs.CV·June 22, 2023·33 cites

TL;DR

This paper introduces a fast, CNN-based alternative to the Segment Anything Model (SAM). It reaches accuracy comparable to the large, computationally intensive SAM while running 50 times faster, which makes it more practical for industrial use.

Contribution

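The paper reformulates the segment-anything task as segments-generation followed by prompting, converts the first stage to standard instance segmentation, and trains a CNN detector with an instance segmentation branch, significantly reducing computation while maintaining performance.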

Findings

01

Achieves accuracy comparable to SAM at 50x the run-time speed.

02

Trains on only 1/50 of the SA-1B dataset released with SAM.

03

Demonstrates effectiveness through extensive experiments.

Abstract

The recently proposed Segment Anything Model (SAM) has had a significant influence on many computer vision tasks. It is becoming a foundational step for many high-level tasks, such as image segmentation, image captioning, and image editing. However, its huge computation cost prevents wider application in industry scenarios. The computation mainly comes from the Transformer architecture at high-resolution inputs. In this paper, we propose a faster alternative method for this fundamental task with comparable performance. By reformulating the task as segments-generation and prompting, we find that a regular CNN detector with an instance segmentation branch can also accomplish this task well. Specifically, we convert this task to the well-studied instance segmentation task and directly train an existing instance segmentation method using only 1/50 of the SA-1B dataset published by SAM…
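The "segments-generation and prompting" split described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the first stage has already produced a stack of binary masks (here hand-made arrays stand in for a CNN instance segmenter's output), and it shows one plausible way the second stage could pick masks for a point or box prompt. The function names `select_by_point` and `select_by_box` are hypothetical, and scoring a box prompt by mask-versus-box IoU is a simplifying assumption.

```python
import numpy as np


def select_by_point(masks, point):
    """Return indices of masks covering the (x, y) point, smallest area first."""
    x, y = point
    hits = [i for i, m in enumerate(masks) if m[y, x]]
    return sorted(hits, key=lambda i: int(masks[i].sum()))


def select_by_box(masks, box):
    """Return the index of the mask with the highest IoU against the box region."""
    x0, y0, x1, y1 = box
    box_mask = np.zeros_like(masks[0], dtype=bool)
    box_mask[y0:y1, x0:x1] = True
    ious = []
    for m in masks:
        inter = np.logical_and(m, box_mask).sum()
        union = np.logical_or(m, box_mask).sum()
        ious.append(inter / union if union else 0.0)
    return int(np.argmax(ious))


# Stage 1 stand-in (assumption): two hand-made masks on an 8x8 image.
masks = np.zeros((2, 8, 8), dtype=bool)
masks[0, 1:4, 1:4] = True  # small square near the top-left
masks[1, 2:8, 2:8] = True  # larger square toward the bottom-right

# Stage 2: prompts only select among the precomputed masks.
point_hits = select_by_point(masks, (2, 2))
box_hit = select_by_box(masks, (2, 2, 8, 8))
```

Under this framing the expensive step (mask generation) runs once per image, and each prompt reduces to a cheap lookup over the existing masks.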

Code & Models

Repositories

casia-iva-lab/fastsam (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications

Methods: Multi-Head Attention · Attention Is All You Need · Segment Anything Model · Linear Layer · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Layer Normalization · Adam · Residual Connection