Segment Anything in High Quality

Lei Ke; Mingqiao Ye; Martin Danelljan; Yifan Liu; Yu-Wing Tai; Chi-Keung Tang; Fisher Yu

arXiv:2306.01567·cs.CV·October 24, 2023·110 cites

TL;DR

HQ-SAM enhances the original Segment Anything Model by significantly improving mask quality for intricate objects, using minimal additional training and preserving its zero-shot capabilities across diverse datasets.

Contribution

The paper introduces HQ-SAM, a high-quality segmentation extension that maintains SAM's promptability and efficiency while significantly improving mask detail through a learnable output token.

Findings

01. HQ-SAM outperforms SAM in mask quality on multiple datasets.

02. HQ-SAM achieves high-quality segmentation with minimal additional training.

03. The method maintains zero-shot transferability across diverse tasks.

Abstract

The recent Segment Anything Model (SAM) represents a big leap in scaling up segmentation models, allowing for powerful zero-shot capabilities and flexible prompting. Despite being trained with 1.1 billion masks, SAM's mask prediction quality falls short in many cases, particularly when dealing with objects that have intricate structures. We propose HQ-SAM, equipping SAM with the ability to accurately segment any object, while maintaining SAM's original promptable design, efficiency, and zero-shot generalizability. Our careful design reuses and preserves the pre-trained model weights of SAM, while only introducing minimal additional parameters and computation. We design a learnable High-Quality Output Token, which is injected into SAM's mask decoder and is responsible for predicting the high-quality mask. Instead of only applying it on mask-decoder features, we first fuse them with early…
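
The output-token idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract is truncated before it names what the mask-decoder features are fused with, so `early_feats` and the element-wise sum below are placeholders, and the function names, tensor shapes, and two-layer MLP are all assumptions made for illustration. In a real decoder the token would also be updated by attention before being used; here it is used directly.

```python
import numpy as np


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with ReLU (hypothetical stand-in for a token head)."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2


def predict_hq_mask(hq_token, decoder_feats, early_feats, params):
    """Turn one learnable token into per-pixel mask logits.

    hq_token:      (D,)      learnable output token (assumed shape)
    decoder_feats: (C, H, W) mask-decoder feature map
    early_feats:   (C, H, W) placeholder for the features fused in;
                             the sum is an assumed fusion operator
    """
    fused = decoder_feats + early_feats          # assumed fusion: element-wise sum
    weights = mlp(hq_token, *params)             # (C,) per-channel weights
    return np.einsum("c,chw->hw", weights, fused)  # dot product at every pixel


rng = np.random.default_rng(0)
C, H, W, D = 8, 16, 16, 32                       # illustrative sizes only
hq_token = rng.normal(size=D)
params = (
    rng.normal(size=(D, D)), np.zeros(D),        # first layer
    rng.normal(size=(D, C)), np.zeros(C),        # second layer
)
decoder_feats = rng.normal(size=(C, H, W))
early_feats = rng.normal(size=(C, H, W))

hq_logits = predict_hq_mask(hq_token, decoder_feats, early_feats, params)
hq_mask = hq_logits > 0                          # threshold logits to a binary mask
```

In this sketch only `hq_token` and `params` would be trained, which mirrors the abstract's point that the pre-trained SAM weights are reused and only a small number of parameters are added.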

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI

Methods: Segment Anything Model