Unsupervised Segmentation by Diffusing, Walking and Cutting
Daniela Ivanova, Marco Aversa, Paul Henderson, John Williamson

TL;DR
This paper introduces an unsupervised image segmentation method built on the self-attention of pre-trained diffusion models. It interprets the attention as a random-walk transition matrix and applies spectral clustering (Normalized Cuts) recursively, producing a hierarchical segmentation without additional training.
Contribution
It presents a zero-shot segmentation approach based on self-attention and spectral clustering that surpasses existing methods on COCO-Stuff-27 and Cityscapes.
Findings
Achieves state-of-the-art zero-shot segmentation results on COCO-Stuff-27 and Cityscapes.
Demonstrates the effectiveness of self-attention-based adjacency matrices for semantic segmentation.
Provides insights into the impact of different features and thresholds on segmentation quality.
Abstract
We propose an unsupervised image segmentation method using features from pre-trained text-to-image diffusion models. Inspired by classic spectral clustering approaches, we construct adjacency matrices from self-attention layers between image patches and recursively partition using Normalised Cuts. A key insight is that self-attention probability distributions, which capture semantic relations between patches, can be interpreted as a transition matrix for random walks across the image. We leverage this by first using Random Walk Normalized Cuts directly on these self-attention activations to partition the image, minimizing transition probabilities between clusters while maximizing coherence within clusters. Applied recursively, this yields a hierarchical segmentation that reflects the rich semantics in the pre-trained attention layers, without any additional training. Next, we explore…
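The partitioning idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`softmax_rows`, `bipartition`, `segment`), the symmetrisation of the attention matrix, the sign-based split, and the stopping parameters `max_ncut` and `min_size` are all assumptions made for illustration, and the toy attention matrix stands in for real diffusion-model activations.

```python
import numpy as np

def softmax_rows(logits):
    """Row-wise softmax, so each row is a probability distribution."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def bipartition(attn):
    """Split patches in two using the second eigenvector of P = D^-1 W."""
    w = 0.5 * (attn + attn.T)                 # assumption: symmetrise the attention
    d = w.sum(axis=1)                         # node degrees (volumes)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    sym = d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]   # D^-1/2 W D^-1/2
    _, vecs = np.linalg.eigh(sym)             # eigenvalues in ascending order
    second = d_inv_sqrt * vecs[:, -2]         # second eigenvector of D^-1 W
    mask = second > 0                         # assumption: split on the sign
    cut = w[mask][:, ~mask].sum()
    vol_a, vol_b = d[mask].sum(), d[~mask].sum()
    if vol_a == 0 or vol_b == 0:
        return mask, np.inf
    return mask, cut / vol_a + cut / vol_b    # Normalized Cut value of the split

def segment(attn, max_ncut=0.5, min_size=2):
    """Recursively bipartition until a split is too costly or too small."""
    labels = np.zeros(attn.shape[0], dtype=int)
    stack = [np.arange(attn.shape[0])]
    next_label = 1
    while stack:
        idx = stack.pop()
        if idx.size < 2 * min_size:
            continue
        mask, ncut = bipartition(attn[np.ix_(idx, idx)])
        too_small = mask.sum() < min_size or (~mask).sum() < min_size
        if ncut > max_ncut or too_small:
            continue                          # keep this cluster whole
        labels[idx[mask]] = next_label        # one side gets a fresh label
        next_label += 1
        stack.extend([idx[mask], idx[~mask]])
    return labels

# Toy "attention": 8 patches in two groups that attend mostly within their group.
block = np.repeat([0, 1], 4)
logits = np.where(block[:, None] == block[None, :], 4.0, 0.0)
attn = softmax_rows(logits)
labels = segment(attn)
print(labels)
```

On this toy input the two groups of patches end up in different clusters, and neither group is split further because any cut inside a uniform block has a Normalized Cut value of about 1, above the assumed `max_ncut` threshold. Which group receives which label depends on the sign of the eigenvector returned by the solver.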
Taxonomy
Topics: Digital Imaging for Blood Diseases · Machine Learning and Data Classification · Industrial Vision Systems and Defect Detection
Methods: Softmax · Attention Is All You Need · Diffusion · Spectral Clustering
