Masked Collaborative Contrast for Weakly Supervised Semantic Segmentation
Fangwen Wu, Jingxuan He, Yufei Yin, Yanbin Hao, Gang Huang, Lechao Cheng

TL;DR
This paper proposes Masked Collaborative Contrast (MCC), a novel weakly supervised semantic segmentation method that leverages masked image modeling and contrastive learning to better highlight semantic regions and improve segmentation accuracy.
Contribution
The paper introduces MCC, a new framework that uses neighborhood relations and contrastive learning with masked local and global outputs for weakly supervised segmentation.
Findings
MCC effectively aligns global and local image features.
The method achieves state-of-the-art performance on standard datasets.
Source code is publicly available for reproducibility.
Abstract
This study introduces an efficacious approach, Masked Collaborative Contrast (MCC), to highlight semantic regions in weakly supervised semantic segmentation. MCC adroitly draws inspiration from masked image modeling and contrastive learning to devise a novel framework that induces keys to contract toward semantic regions. Unlike prevalent techniques that directly eradicate patch regions in the input image when generating masks, we scrutinize the neighborhood relations of patch tokens by exploring masks considering keys on the affinity matrix. Moreover, we generate positive and negative samples in contrastive learning by utilizing the masked local output and contrasting it with the global output. Elaborate experiments on commonly employed datasets show that the proposed MCC mechanism effectively aligns global and local perspectives within the image, attaining impressive performance.…
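The two ideas in the abstract, masking keys on the affinity matrix instead of erasing input patches, and contrasting the masked local output with the global output, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the single-head attention, the mean pooling, the InfoNCE-style loss, the random choice of masked keys, the random negatives, and every name below (`attention`, `info_nce`, `key_mask`) are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, key_mask=None):
    """Single-head attention; key_mask[j] == True hides key j from every query."""
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (num_queries, num_keys) affinity matrix
    if key_mask is not None:
        # Mask columns of the affinity matrix rather than pixels of the image.
        scores = np.where(key_mask[None, :], -np.inf, scores)
    return softmax(scores, axis=-1) @ v

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's contrastive loss)."""
    def unit(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    a, p, n = unit(anchor), unit(positive), unit(negatives)
    # The positive logit comes first, followed by one logit per negative.
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    return -np.log(softmax(logits)[0])

rng = np.random.default_rng(0)
num_patches, dim = 16, 8
tokens = rng.normal(size=(num_patches, dim))  # hypothetical patch tokens

# Global view: every key is visible.
global_out = attention(tokens, tokens, tokens).mean(axis=0)

# Local view: hide a few keys (chosen at random here, purely as a placeholder).
key_mask = np.zeros(num_patches, dtype=bool)
key_mask[rng.choice(num_patches, size=4, replace=False)] = True
local_out = attention(tokens, tokens, tokens, key_mask).mean(axis=0)

# Placeholder negatives; how the paper selects positives and negatives is not shown here.
negatives = rng.normal(size=(5, dim))
loss = info_nce(local_out, global_out, negatives)
```

Masking a column of the affinity matrix gives the same result as removing that key and its value from the attention computation, so every query (including those at masked positions) is still produced, but only from the visible neighborhood.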
Videos
Masked Collaborative Contrast for Weakly Supervised Semantic Segmentation · YouTube
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Adam · Absolute Position Encodings · Softmax · Layer Normalization · Byte Pair Encoding
