Segment Anything in High Quality
Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

TL;DR
HQ-SAM enhances the original Segment Anything Model by significantly improving mask quality for intricate objects, using minimal additional training and preserving its zero-shot capabilities across diverse datasets.
Contribution
The paper introduces HQ-SAM, a high-quality segmentation extension that maintains SAM's promptability and efficiency while significantly improving mask detail through a learnable output token.
Findings
HQ-SAM outperforms SAM in mask quality on multiple datasets.
HQ-SAM achieves high-quality segmentation with minimal additional training.
The method maintains zero-shot transferability across diverse tasks.
Abstract
The recent Segment Anything Model (SAM) represents a big leap in scaling up segmentation models, allowing for powerful zero-shot capabilities and flexible prompting. Despite being trained with 1.1 billion masks, SAM's mask prediction quality falls short in many cases, particularly when dealing with objects that have intricate structures. We propose HQ-SAM, equipping SAM with the ability to accurately segment any object, while maintaining SAM's original promptable design, efficiency, and zero-shot generalizability. Our careful design reuses and preserves the pre-trained model weights of SAM, while only introducing minimal additional parameters and computation. We design a learnable High-Quality Output Token, which is injected into SAM's mask decoder and is responsible for predicting the high-quality mask. Instead of only applying it on mask-decoder features, we first fuse them with early…
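To make the idea of an output token concrete, here is a minimal sketch of how a token vector can be turned into a mask. It assumes the prediction is a per-pixel dot product between the token and a feature map, which is how SAM's mask decoder uses its output tokens. The feature fusion that the truncated abstract begins to describe is not modeled. The names `predict_mask_logits` and `binarize`, the tiny 2x2 feature map, and the zero threshold are all illustrative assumptions, not part of the HQ-SAM code.

```python
# Illustrative sketch only: a token vector scores every pixel of a feature map.
# All names and values below are hypothetical, not taken from HQ-SAM.

def predict_mask_logits(token, features):
    """Return an H x W grid of logits.

    token:    list of C floats (stands in for a learned output token)
    features: H x W x C nested lists (stands in for decoder features)
    Each logit is the dot product of the token with that pixel's feature vector.
    """
    return [
        [sum(t * f for t, f in zip(token, pixel)) for pixel in row]
        for row in features
    ]


def binarize(logits, threshold=0.0):
    """Turn logits into a 0/1 mask; pixels strictly above the threshold are foreground."""
    return [[1 if value > threshold else 0 for value in row] for row in logits]


# Toy 2 x 2 feature map with 2 channels per pixel (made-up numbers).
toy_features = [
    [[2.0, 1.0], [0.0, 3.0]],
    [[1.0, 1.0], [5.0, 0.0]],
]
toy_token = [1.0, -1.0]  # in a real model this vector would be learned

toy_logits = predict_mask_logits(toy_token, toy_features)
toy_mask = binarize(toy_logits)
```

In a trained model the token and the features would both come from learned networks. Under the abstract's description, only the small added components would be trained while SAM's pre-trained weights are reused.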
Code & Models
- 🤗 Uminosachi/sam-hq · model · ♡ 2
- 🤗 ductai199x/sam_hq_vit_huge · model · 3 dl
- 🤗 ductai199x/sam_hq_vit_large · model · 9 dl · ♡ 2
- 🤗 syscv-community/sam-hq-vit-base · model · 5.0k dl · ♡ 12
- 🤗 syscv-community/sam-hq-vit-large · model · 95 dl · ♡ 2
- 🤗 syscv-community/sam-hq-vit-huge · model · 95 dl · ♡ 6
- 🤗 wanziteng/sd-webui-inpaint-anything-1.17.0 · model
- 🤗 wanziteng/sd-webui-inpaint-anything-1.16.2 · model
- 🤗 wanziteng/sd-webui-inpaint-anything-1.16.0 · model
- 🤗 wanziteng/sd-webui-inpaint-anything-1.15.1 · model
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
Methods: Segment Anything Model
