ReCo: Retrieve and Co-segment for Zero-shot Transfer
Gyungin Shin, Weidi Xie, Samuel Albanie

TL;DR
ReCo combines language-image pre-training with retrieval to enable zero-shot, nameable segmentation without pixel labels. This lets it segment rare objects and avoids the cost of pixel-level annotation.
Contribution
The paper introduces a method that combines retrieval and co-segmentation, building on CLIP, to achieve zero-shot segmentation with nameable predictions.
Findings
ReCo outperforms existing unsupervised segmentation methods.
It enables zero-shot transfer with nameable predictions.
ReCo can generate segmenters for rare objects.
Abstract
Semantic segmentation has a broad range of applications, but its real-world impact has been significantly limited by the prohibitive annotation costs necessary to enable deployment. Segmentation methods that forgo supervision can side-step these costs, but exhibit the inconvenient requirement to provide labelled examples from the target distribution to assign concept names to predictions. An alternative line of work in language-image pre-training has recently demonstrated the potential to produce models that can both assign names across large vocabularies of concepts and enable zero-shot transfer for classification, but do not demonstrate commensurate segmentation abilities. In this work, we strive to achieve a synthesis of these two approaches that combines their strengths. We leverage the retrieval abilities of one such language-image pre-trained model, CLIP, to dynamically curate…
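To make the retrieval idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes retrieval means ranking image embeddings by cosine similarity to a text embedding of a concept name. The function names `cosine` and `retrieve_top_k` and the toy vectors are hypothetical; in practice the embeddings would come from a language-image model such as CLIP, which is not called here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve_top_k(text_embedding, image_embeddings, k):
    """Return indices of the k images most similar to the text query, best first."""
    scored = [(cosine(text_embedding, emb), idx)
              for idx, emb in enumerate(image_embeddings)]
    # Sort by descending similarity; Python's sort is stable, so ties keep index order.
    scored.sort(key=lambda pair: -pair[0])
    return [idx for _, idx in scored[:k]]

# Toy stand-ins for embeddings (assumed values, for illustration only).
image_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
text_embedding = [1.0, 0.0, 0.0]

top_indices = retrieve_top_k(text_embedding, image_embeddings, k=2)
```

The images selected this way would then be passed to a co-segmentation step, which this sketch does not cover.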
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Contrastive Language-Image Pre-training
