MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation

Zhenchao Jin; Dongdong Yu; Zehuan Yuan; Lequan Yu

arXiv:2209.04471 · cs.CV · September 13, 2022

TL;DR

MCIBI++ is a semantic segmentation approach that uses dataset-level category representations, drawn from beyond the individual image, to strengthen pixel-level representations. It reports state-of-the-art results on seven benchmarks.

Contribution

The paper proposes MCIBI++, a new framework that incorporates dataset-level contextual information through a dynamic memory module and an iterative inference strategy, improving segmentation performance.

Findings

01. Achieved state-of-the-art results on seven benchmarks.

02. Enhanced segmentation accuracy with dataset-level context aggregation.

03. Effective extension to video semantic segmentation.

Abstract

Co-occurrent visual patterns make context aggregation an essential paradigm for semantic segmentation. Existing studies focus on modeling contexts within an image while neglecting the valuable semantics of the corresponding category beyond the image. To this end, we propose a novel soft mining contextual information beyond image paradigm, named MCIBI++, to further boost the pixel-level representations. Specifically, we first set up a dynamically updated memory module to store the dataset-level distribution information of various categories and then leverage this information to yield the dataset-level category representations during the network forward pass. After that, we generate a class probability distribution for each pixel representation and conduct the dataset-level context aggregation with the class probability distribution as weights. Finally, the original pixel representations are…
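The soft aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes, purely for illustration, that the memory holds a per-class mean and standard deviation and that category representations are drawn from a Gaussian; the function names `sample_class_representations` and `soft_context` are hypothetical. Because the abstract is truncated, the sketch stops at the aggregated context and does not show how it is combined with the original pixel representations.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sample_class_representations(mean, std, rng):
    """Draw one representation per class from the stored statistics.

    ASSUMPTION: the memory stores a per-class mean and std of shape (K, C)
    and a Gaussian is used; the abstract only says "distribution information".
    """
    return mean + std * rng.standard_normal(mean.shape)


def soft_context(class_logits, class_reprs):
    """Aggregate dataset-level context with class probabilities as weights.

    class_logits: (N, K) per-pixel class scores.
    class_reprs:  (K, C) dataset-level category representations.
    Returns an (N, C) array: one context vector per pixel.
    """
    probs = softmax(class_logits, axis=1)  # (N, K), each row sums to 1
    return probs @ class_reprs             # probability-weighted sum over classes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_pixels, num_classes, channels = 6, 3, 4
    memory_mean = rng.normal(size=(num_classes, channels))
    memory_std = np.full((num_classes, channels), 0.1)
    logits = rng.normal(size=(num_pixels, num_classes))

    reprs = sample_class_representations(memory_mean, memory_std, rng)
    context = soft_context(logits, reprs)
    print(context.shape)
```

Weighting by the full probability distribution, rather than by a hard arg-max label, means a pixel with an uncertain prediction receives a blend of several category representations instead of committing to a single, possibly wrong, class.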

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications