Zero-Shot Co-salient Object Detection Framework
Haoke Xiao, Lv Tang, Bo Li, Zhiming Luo, Shaozi Li

TL;DR
This paper presents the first zero-shot co-salient object detection framework that leverages foundational vision models without any training, introducing two novel modules and outperforming existing unsupervised methods as well as some supervised ones.
Contribution
It introduces a training-free zero-shot CoSOD framework with novel group prompt and co-saliency map modules, outperforming prior unsupervised and some supervised methods.
Findings
Outperforms existing unsupervised CoSOD methods.
Surpasses some fully supervised methods published before 2020.
Remains competitive with recent supervised approaches.
Abstract
Co-salient Object Detection (CoSOD) endeavors to replicate the human visual system's capacity to recognize common and salient objects within a collection of images. Despite recent advancements in deep learning models, these models still rely on training with well-annotated CoSOD datasets. The exploration of training-free zero-shot CoSOD frameworks has been limited. In this paper, taking inspiration from the zero-shot transfer capabilities of foundational computer vision models, we introduce the first zero-shot CoSOD framework that harnesses these models without any training process. To achieve this, we introduce two novel components in our proposed framework: the group prompt generation (GPG) module and the co-saliency map generation (CMP) module. We evaluate the framework's performance on widely-used datasets and observe impressive results. Our approach surpasses existing unsupervised…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
