TL;DR
EASe is an unsupervised segmentation framework that uses feature calibration and self-supervised upsampling to discover fine-grained masks in complex scenes; the authors report that it outperforms existing methods.
Contribution
The paper presents EASe, a domain-agnostic unsupervised segmentation method whose two new modules, SAUCE and CAFE, enable fine-grained mask discovery at the pixel level.
Findings
EASe outperforms previous state-of-the-art methods on standard benchmarks.
SAUCE excites low-resolution foundation-model feature channels, selectively calibrating them to recover finer detail.
CAFE leverages attention scores for semantic grouping without training.
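To make the idea of "exciting" low-resolution feature channels concrete, here is a minimal sketch in plain Python. It is not the paper's SAUCE module: the parameter-free sigmoid gate, the nearest-neighbour upsampling, and the function names `excite_channels` and `upsample_nearest` are all assumptions chosen for illustration, in the spirit of generic squeeze-and-excitation gating.

```python
import math

def excite_channels(feats):
    """Re-weight channels of a C x H x W feature map given as nested lists.

    Hypothetical, parameter-free gating; NOT the paper's SAUCE module.
    """
    # Squeeze: one global-average descriptor per channel.
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feats]
    center = sum(means) / len(means)
    # Excite: sigmoid of the centred descriptor gives a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-(m - center))) for m in means]
    # Scale every value in a channel by that channel's gate.
    excited = [[[v * g for v in row] for row in ch] for ch, g in zip(feats, gates)]
    return excited, gates

def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one H x W channel by an integer factor."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]  # repeat columns
        out.extend([list(wide) for _ in range(factor)])  # repeat rows
    return out

# Toy 2-channel, 2 x 2 low-resolution feature map.
feats = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0, mean 2.5
    [[0.0, 0.0], [0.0, 2.0]],  # channel 1, mean 0.5
]
excited, gates = excite_channels(feats)
upsampled = [upsample_nearest(ch, 2) for ch in excited]  # each channel is now 4 x 4
```

In this toy setup the channel with the larger average activation receives a gate above 0.5 and the weaker channel a gate below 0.5, so the stronger channel dominates after upsampling. A learned gate and a learned, semantics-aware upsampler would replace both hypothetical pieces in a real system.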
Abstract
Unsupervised segmentation approaches have increasingly leveraged foundation models (FM) to improve salient object discovery. However, these methods often falter in scenes with complex, multi-component morphologies, where fine-grained structural detail is indispensable. Many state-of-the-art unsupervised segmentation pipelines rely on mask discovery approaches that utilize coarse, patch-level representations. These coarse representations inherently suppress the fine-grained detail required to resolve such complex morphologies. To overcome this limitation, we propose Excite, Attend and Segment (EASe), an unsupervised domain-agnostic semantic segmentation framework for easy fine-grained mask discovery across challenging real-world scenes. EASe utilizes novel Semantic-Aware Upsampling with Channel Excitation (SAUCE) to excite low-resolution FM feature channels for selective calibration and…
