Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation
Seonghoon Yu, Paul Hongsuck Seo, Jeany Son

TL;DR
This paper introduces a framework that automatically generates high-quality, distinctive pseudo-supervisions for referring image segmentation by leveraging foundation models and novel captioning strategies, reducing manual labeling costs.
Contribution
It presents a new pseudo-supervision generation method using distinctive caption sampling and filtering, enabling effective training of RIS models without manual annotations.
Findings
Outperforms state-of-the-art weakly supervised and zero-shot methods on RIS benchmarks.
Surpasses fully supervised methods in unseen domains.
Enhances semi-supervised learning when combined with human annotations.
Abstract
We propose a new framework that automatically generates high-quality segmentation masks with their referring expressions as pseudo supervisions for referring image segmentation (RIS). These pseudo supervisions allow the training of any supervised RIS methods without the cost of manual labeling. To achieve this, we incorporate existing segmentation and image captioning foundation models, leveraging their broad generalization capabilities. However, the naive incorporation of these models may generate non-distinctive expressions that do not distinctively refer to the target masks. To address this challenge, we propose two-fold strategies that generate distinctive captions: 1) 'distinctive caption sampling', a new decoding method for the captioning model, to generate multiple expression candidates with detailed words focusing on the target. 2) 'distinctiveness-based text filtering' to…
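The abstract excerpt is cut off before it defines the filtering step, so the following is only a minimal sketch of what a distinctiveness filter could look like, not the paper's method. It assumes each candidate caption already has a similarity score against every mask in the image (for example from an image-text model), and it keeps a caption only when it matches the target mask better than any other mask by a margin. The function name `filter_distinctive`, the `margin` parameter, and the score values are all hypothetical.

```python
from typing import List, Sequence


def filter_distinctive(
    captions: Sequence[str],
    scores: Sequence[Sequence[float]],
    target: int,
    margin: float = 0.0,
) -> List[str]:
    """Keep captions that match the target mask better than every other mask.

    scores[i][j] is an assumed similarity between caption i and mask j.
    The scoring rule is an illustrative assumption, not the paper's definition.
    """
    kept = []
    for caption, row in zip(captions, scores):
        # Best score this caption achieves on any non-target mask.
        others = [s for j, s in enumerate(row) if j != target]
        best_other = max(others) if others else float("-inf")
        # A caption is treated as distinctive when it clearly prefers the target.
        if row[target] - best_other > margin:
            kept.append(caption)
    return kept


# Toy example: two masks in one image, mask 0 is the target.
captions = ["a dog", "a brown dog on the left", "a dog lying on the grass"]
scores = [
    [0.30, 0.31],  # generic caption fits both masks equally well
    [0.35, 0.22],  # detailed caption clearly prefers the target
    [0.33, 0.29],  # prefers the target, but only slightly
]
kept = filter_distinctive(captions, scores, target=0, margin=0.05)
print(kept)
```

With a margin of 0.05 only the most specific caption survives in this toy data; lowering the margin to 0 would also keep the third caption. The margin is a made-up knob for trading off how many captions are kept against how distinctive they are.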
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Brain Tumor Detection and Classification
