UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity

Junwei Yu; Trevor Darrell; XuDong Wang

arXiv:2511.13714·cs.CV·November 18, 2025

TL;DR

UnSAMv2 is a self-supervised approach that lets the Segment Anything Model (SAM) control segmentation granularity precisely and continuously, without human annotations, and improves segmentation metrics across 11 benchmarks.

Contribution

The paper proposes a granularity control embedding and a self-supervised learning method that together unlock multi-scale segmentation in SAM without requiring dense annotations.
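
This page does not describe how the granularity control embedding is built, so the following is only a minimal sketch of one plausible form, not the authors' implementation. It assumes the granularity is a scalar in [0, 1] that is encoded with sinusoidal (Fourier) features and appended to the prompt tokens as one extra token. The names `granularity_embedding` and `append_granularity_token`, the frequency schedule, and the token layout are all hypothetical.

```python
import math


def granularity_embedding(g, num_frequencies=4):
    """Encode a scalar granularity g in [0, 1] as sinusoidal features.

    Hypothetical illustration: the actual encoding used by UnSAMv2
    is not specified on this page.
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("granularity must lie in [0, 1]")
    features = []
    for k in range(num_frequencies):
        # Assumed schedule: frequencies double at each step.
        angle = (2 ** k) * math.pi * g
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def append_granularity_token(prompt_tokens, g, num_frequencies=4):
    """Return a new token list with the granularity token added last.

    prompt_tokens is a list of equal-length feature vectors (assumed layout).
    """
    return prompt_tokens + [granularity_embedding(g, num_frequencies)]


if __name__ == "__main__":
    # Two dummy 8-dimensional prompt tokens (e.g. encoded point prompts).
    tokens = [[0.0] * 8, [1.0] * 8]
    coarse = append_granularity_token(tokens, 0.9)
    fine = append_granularity_token(tokens, 0.1)
    print(len(coarse), len(coarse[-1]))
    print(coarse[-1] != fine[-1])
```

Because the scalar is a continuous input rather than a choice among a fixed set of output masks, a model conditioned this way could in principle be queried at any intermediate scale, which is the kind of continuous control the paper claims.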

Findings

01 Achieves improved segmentation metrics across 11 benchmarks.

02 Enables continuous control over segmentation scale.

03 Uses only 6K unlabeled images with minimal additional parameters.

Abstract

The Segment Anything Model (SAM) family has become a widely adopted vision foundation model, but its ability to control segmentation granularity remains limited. Users often need to refine results manually - by adding more prompts or selecting from pre-generated masks - to achieve the desired level of detail. This process can be ambiguous, as the same prompt may correspond to several plausible masks, and collecting dense annotations across all granularities is prohibitively expensive, making supervised solutions infeasible. To address this limitation, we introduce UnSAMv2, which enables segment anything at any granularity without human annotations. UnSAMv2 extends the divide-and-conquer strategy of UnSAM by discovering abundant mask-granularity pairs and introducing a novel granularity control embedding that enables precise, continuous control over segmentation scale. Remarkably, with…

Code & Models

Models

yujunwei04/unsam-whole-image-segmentation
model · 108 downloads · 3 likes

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection