Spatial Semantic Recurrent Mining for Referring Image Segmentation

Jiaxing Yang; Lihe Zhang; Jiayu Sun; Huchuan Lu

arXiv:2405.09006 · cs.CV · May 16, 2024

Open Access

TL;DR

This paper introduces Spatial Semantic Recurrent Mining (S²RM), a novel method for Referring Image Segmentation that enhances cross-modality fusion by recurrently correlating semantic features across spatial and contextual dimensions.

Contribution

The paper proposes S²RM, a new spatial semantic recurrent framework, and a Cross-scale Abstract Semantic Guided Decoder (CASG) for improved referent segmentation accuracy.

Findings

01. Outperforms state-of-the-art methods on four challenging datasets.

02. Effectively models global relationships and structured semantics.

03. Enhances cross-modality feature fusion in RIS.

Abstract

Referring Image Segmentation (RIS) requires language and appearance semantics to understand each other more deeply, and the need becomes especially acute in hard cases. To achieve this, existing works tend to resort to various trans-representing mechanisms that directly feed language semantics forward along the main RGB branch; this, however, leaves the referent distribution weakly mined in space and lets non-referent semantics contaminate the channels. In this paper, we propose Spatial Semantic Recurrent Mining (S²RM) to achieve high-quality cross-modality fusion. It follows a three-step strategy: distributing the language feature, spatial semantic recurrent coparsing, and parsed-semantic balancing. During fusion, S²RM first generates a constraint-weak yet distribution-aware language feature, then bundles the features of each row and column from…

Taxonomy

Topics: Image Retrieval and Classification Techniques · Semantic Web and Ontologies