MMNet: Multi-Mask Network for Referring Image Segmentation

Yichen Yan; Xingjian He; Wenxuan Wan; Jing Liu

arXiv:2305.14969 · cs.CV · May 25, 2023 · 5 citations

Open Access

TL;DR

This paper introduces MMNet, an end-to-end multi-mask network that generates multiple segmentation masks from natural language expressions, effectively handling language and object diversity for improved referring image segmentation.

Contribution

The paper proposes a novel multi-mask network that produces multiple segmentation masks and combines them to address uncertainty in referring image segmentation tasks.

Findings

01. Outperforms state-of-the-art methods on the RefCOCO, RefCOCO+, and G-Ref datasets

02. Eliminates the need for post-processing in segmentation

03. Effectively reduces language-induced randomness

Abstract

Referring image segmentation aims to segment the object referred to by a natural language expression from an image. However, this task is challenging due to the distinct data properties of text and images, and the randomness introduced by diverse objects and unrestricted language expressions. Most previous work focuses on improving cross-modal feature fusion without fully addressing the inherent uncertainty caused by diverse objects and unrestricted language. To tackle these problems, we propose an end-to-end Multi-Mask Network for referring image segmentation (MMNet). We first combine the image and the language expression and then employ an attention mechanism to generate multiple queries that represent different aspects of the language expression. We then use these queries to produce a series of corresponding segmentation masks, assigning a score to each mask that reflects its importance. The…
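The step of scoring several candidate masks and merging them into one prediction can be illustrated with a small sketch. The abstract is truncated before it states how the masks are combined, so the softmax-weighted sum below is an assumption chosen for illustration, not the authors' confirmed method. The names `softmax`, `combine_masks`, and `binarize`, the toy mask values, and the zero threshold are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw per-mask scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_masks(mask_logits, scores):
    """Fuse N candidate masks (each an H x W grid of logits) into one grid.

    Assumption: the fusion is a softmax-weighted sum of the candidates.
    """
    weights = softmax(scores)
    height, width = len(mask_logits[0]), len(mask_logits[0][0])
    fused = [
        [sum(w * m[y][x] for w, m in zip(weights, mask_logits))
         for x in range(width)]
        for y in range(height)
    ]
    return fused, weights


def binarize(fused, threshold=0.0):
    """Threshold fused logits into a foreground/background mask."""
    return [[value > threshold for value in row] for row in fused]


# Two toy 2x2 candidate masks that disagree on the top row.
mask_a = [[4.0, -4.0], [4.0, -4.0]]
mask_b = [[-4.0, 4.0], [4.0, -4.0]]
scores = [2.0, 0.0]  # hypothetical importance scores, one per mask

fused, weights = combine_masks([mask_a, mask_b], scores)
binary = binarize(fused)
print(binary)  # [[True, False], [True, False]]
```

Because the first candidate has the higher score, it dominates the pixels where the two candidates disagree, while pixels on which both agree keep their shared label. With equal scores, the conflicting pixels would cancel to zero, which shows why the per-mask score matters in this kind of fusion.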

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling