SeeDiff: Off-the-Shelf Seeded Mask Generation from Diffusion Models

Joon Hyun Park; Kumju Jo; Sungyong Baik

arXiv:2507.19808 · cs.CV · September 16, 2025

TL;DR

SeeDiff uses the attention mechanisms of Stable Diffusion to generate high-quality semantic segmentation masks without extra training, prompt tuning, or a pre-trained segmentation network: it extracts initial seeds from cross-attention and expands them through self-attention.

Contribution

This work introduces SeeDiff, a novel method that exploits attention in diffusion models for off-the-shelf semantic mask generation without additional training.

Findings

01 Achieves high-quality masks without training or prompt tuning.

02 Utilizes cross-attention for initial object seeds.

03 Expands seeds using multi-scale self-attention for full object coverage.
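
The seed-then-expand idea in the findings above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extract_seeds` and `expand_seeds`, the thresholds, and the propagation rule are all assumptions chosen for illustration, the attention maps are synthetic NumPy arrays rather than maps read from Stable Diffusion, and only a single scale is used instead of multi-scale self-attention.

```python
import numpy as np

def extract_seeds(cross_attn, rel_threshold=0.5):
    """Hypothetical helper: keep pixels whose cross-attention to the class
    token is high after min-max normalization."""
    lo, hi = cross_attn.min(), cross_attn.max()
    normalized = (cross_attn - lo) / (hi - lo + 1e-8)
    return normalized >= rel_threshold

def expand_seeds(seeds, self_attn, steps=3, threshold=0.5):
    """Hypothetical helper: spread seed scores to pixels that attend to the
    seeds, using a row-normalized (H*W, H*W) self-attention matrix."""
    score = seeds.astype(np.float64).ravel()
    for _ in range(steps):
        score = self_attn @ score             # each pixel averages the scores it attends to
        score = score / (score.max() + 1e-8)  # rescale so the strongest pixel is ~1
    score = np.maximum(score, seeds.ravel())  # original seeds always stay in the mask
    return (score >= threshold).reshape(seeds.shape)

# Synthetic 4x4 example (assumption): the object fills the left two columns.
h = w = 4
object_mask = np.zeros((h, w), dtype=bool)
object_mask[:, :2] = True

# Idealized self-attention: pixels attend uniformly within their own region.
labels = object_mask.ravel()
self_attn = (labels[:, None] == labels[None, :]).astype(np.float64)
self_attn /= self_attn.sum(axis=1, keepdims=True)

# Idealized cross-attention: only two object pixels respond strongly.
cross_attn = np.zeros((h, w))
cross_attn[1, 0] = 1.0
cross_attn[2, 1] = 0.6

seeds = extract_seeds(cross_attn)
mask = expand_seeds(seeds, self_attn)
```

On this toy input the two seed pixels grow into a mask covering the whole left half of the grid. Real attention maps are far noisier than these idealized ones, so the thresholds and the number of propagation steps here should be read as placeholders.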

Abstract

Tasked with pixel-level object classification, semantic segmentation networks require the laborious preparation of pixel-level annotation masks. To obtain pixel-level annotation masks for a given class without human effort, a few recent works have proposed generating pairs of images and annotation masks by exploiting the image-text relationships modeled by text-to-image generative models, especially Stable Diffusion. However, these works do not fully exploit the capability of text-guided diffusion models and thus require a pre-trained segmentation network, careful text prompt tuning, or the training of a segmentation network to generate the final annotation masks. In this work, we take a closer look at the attention mechanisms of Stable Diffusion, from which we draw connections with classical seeded segmentation approaches. In particular, we show that cross-attention alone…
