CMF: Cascaded Multi-model Fusion for Referring Image Segmentation
Jianhua Yang, Yan Huang, Zhanyu Ma, Liang Wang

TL;DR
This paper introduces a Cascaded Multi-modal Fusion (CMF) module for referring image segmentation. The module fuses visual and linguistic features through parallel atrous convolutional layers plus a cascaded branch that progressively integrates multi-scale context, and it outperforms most state-of-the-art methods on four benchmark datasets.
Contribution
The paper proposes a novel CMF module with cascaded branches for better multi-scale context modeling in referring image segmentation.
Findings
Outperforms most state-of-the-art methods on four benchmark datasets.
Effectively integrates multi-scale contextual information.
Demonstrates the importance of cascaded fusion in referring image segmentation (RIS).
Abstract
In this work, we address the task of referring image segmentation (RIS), which aims at predicting a segmentation mask for the object described by a natural language expression. Most existing methods focus on establishing unidirectional or directional relationships between visual and linguistic features to associate two modalities together, while the multi-scale context is ignored or insufficiently modeled. Multi-scale context is crucial to localize and segment those objects that have large scale variations during the multi-modal fusion process. To solve this problem, we propose a simple yet effective Cascaded Multi-modal Fusion (CMF) module, which stacks multiple atrous convolutional layers in parallel and further introduces a cascaded branch to fuse visual and linguistic features. The cascaded branch can progressively integrate multi-scale contextual information and facilitate the…
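The difference between parallel atrous layers and a cascaded branch can be illustrated with a toy 1-D example. This is only a minimal sketch under stated assumptions, not the authors' implementation: the 1-D signal stands in for a fused visual-linguistic feature map, the all-ones kernel and the rates (1, 2, 4) are arbitrary, and the cascade wiring (each stage receives the input plus the previous stage's output) is a hypothetical choice, since the abstract does not specify it. The function names `atrous_conv1d`, `parallel_branch`, `cascaded_branch`, and `support` are illustrative.

```python
from typing import List, Sequence


def atrous_conv1d(x: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """1-D atrous (dilated) convolution with zero padding and same-length output."""
    center = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate  # taps are spaced `rate` samples apart
            if 0 <= idx < n:
                acc += w * x[idx]
        out.append(acc)
    return out


def parallel_branch(x: Sequence[float], kernel: Sequence[float], rates: Sequence[int]) -> List[float]:
    """Every rate sees the same input independently; the outputs are summed."""
    outs = [atrous_conv1d(x, kernel, r) for r in rates]
    return [sum(vals) for vals in zip(*outs)]


def cascaded_branch(x: Sequence[float], kernel: Sequence[float], rates: Sequence[int]) -> List[float]:
    """Assumed cascade: each stage takes the input plus the previous stage's output."""
    prev = [0.0] * len(x)
    stage_outs = []
    for r in rates:
        stage_in = [a + b for a, b in zip(x, prev)]
        prev = atrous_conv1d(stage_in, kernel, r)
        stage_outs.append(prev)
    return [sum(vals) for vals in zip(*stage_outs)]


def support(y: Sequence[float]) -> int:
    """Number of positions that received any contribution from the input."""
    return sum(1 for v in y if v != 0)


# A unit impulse in the middle of a length-21 signal (toy stand-in for a feature map).
impulse = [0.0] * 21
impulse[10] = 1.0
kernel = [1.0, 1.0, 1.0]
rates = (1, 2, 4)

print(support(parallel_branch(impulse, kernel, rates)))  # 7
print(support(cascaded_branch(impulse, kernel, rates)))  # 15
```

With the same three rates, the parallel branch reaches only the offsets 0, ±1, ±2, and ±4 around the impulse, while the assumed cascade covers every offset from -7 to 7. This is one way to see why progressively chaining atrous layers can gather broader and denser multi-scale context than running them side by side.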
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
