RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner

Ying Zang; Chenglong Fu; Runlong Cao; Didi Zhu; Min Zhang; Wenjun Hu; Lanyun Zhu; Tianrun Chen

arXiv:2402.05589 · cs.CV · February 13, 2024 · 1 citation

Open Access

TL;DR

RESMatch is a semi-supervised learning (SSL) method for referring expression segmentation (RES). It adapts SSL techniques to the linguistic and visual complexities of RES, reducing the need for extensive annotated data while achieving state-of-the-art performance.

Contribution

The paper introduces RESMatch, presented as the first semi-supervised approach for RES, with specific adaptations for text and image perturbations that improve performance over baselines.

Findings

01. RESMatch outperforms baseline methods on multiple datasets.

02. It establishes a new state-of-the-art in RES.

03. The approach reduces reliance on annotated data.

Abstract

Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate understanding of both visual and textual contexts and often requires extensive training data. This paper introduces RESMatch, the first semi-supervised learning (SSL) approach for RES, aimed at reducing reliance on exhaustive data annotation. Extensive validation on multiple RES datasets demonstrates that RESMatch significantly outperforms baseline approaches, establishing a new state-of-the-art. Although existing SSL techniques are effective in image segmentation, we find that they fall short in RES. Facing challenges that include the comprehension of free-form linguistic descriptions and the variability in object attributes, RESMatch introduces a trifecta…

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Speech Recognition and Synthesis