Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information

Qi Lai; Chi-Man Vong

arXiv:2405.04913·cs.CV·May 9, 2024

Open Access

TL;DR

This paper introduces DSCNet, a novel weakly-supervised semantic segmentation framework that leverages dual-stream contrastive learning to utilize both pixel-wise and semantic-wise inter-image information, significantly improving performance.

Contribution

The paper proposes a new end-to-end WSSS framework with pixel-wise group contrast and semantic-wise graph contrast, integrating dual-stream contrastive learning for enhanced segmentation accuracy.

Findings

01. Outperforms state-of-the-art methods on PASCAL VOC and MS COCO datasets.

02. Demonstrates the effectiveness of dual-stream contrastive learning in WSSS.

03. Achieves significant performance gains over baseline models.

Abstract

Weakly supervised semantic segmentation (WSSS) aims at learning a semantic segmentation model with only image-level tags. Despite intensive research on deep learning approaches for over a decade, there is still a significant performance gap between WSSS and fully supervised semantic segmentation. Most current WSSS methods focus on the limited information within a single image (pixel-wise) while ignoring the valuable inter-image (semantic-wise) information. From this perspective, a novel end-to-end WSSS framework called DSCNet is developed with two innovations: i) pixel-wise group contrast and semantic-wise graph contrast are proposed and introduced into the WSSS framework; ii) a novel dual-stream contrastive learning (DSCL) mechanism is designed to jointly handle pixel-wise and semantic-wise context information for better WSSS performance. Specifically, the pixel-wise group contrast learning (PGCL)…
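
The excerpt above is cut off before the losses are defined, so the following is only a rough illustration of the general idea of contrasting embeddings drawn from several images. It is a minimal sketch of a generic supervised InfoNCE-style loss, not DSCNet's PGCL or its graph contrast. The function name `contrastive_loss`, the temperature value, and the toy embeddings are all assumptions made for this example.

```python
import math


def contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised InfoNCE-style loss (illustrative, not the paper's loss).

    embeddings: list of equal-length, non-zero float vectors, e.g. pixel or
                region features pooled from several images (assumption).
    labels:     list of class ids, one per embedding (e.g. pseudo-labels).
    """
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    total, counted = 0.0, 0
    for i in range(len(normed)):
        # Scaled similarity of anchor i to every other embedding.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(len(normed))
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing

        # Numerically stable log-sum-exp over all non-anchor samples.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))

        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1

    if counted == 0:
        raise ValueError("no anchor has a same-class positive")
    return total / counted


# Toy embeddings: two tight clusters, as if pooled from different images.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is lower when the labels agree with the cluster structure than when they cut across it, which is the behaviour a contrastive objective is meant to reward.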

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Face and Expression Recognition

Methods: Contrastive Learning · Focus