Fine-grained Background Representation for Weakly Supervised Semantic Segmentation
Xu Yin, Woobin Im, Dongbo Min, Yuchi Huo, Fei Pan, Sung-Eui Yoon

TL;DR
This paper introduces a simple yet effective fine-grained background representation method for weakly supervised semantic segmentation, improving discrimination of background and foreground regions using contrastive learning and active sampling.
Contribution
The proposed method avoids using class prototypes, instead utilizing negative regions of interest and contrastive learning to enhance background and foreground discrimination in WSSS.
Findings
Achieves 73.2 mIoU on Pascal VOC with image-level labels.
Attains 45.6 mIoU on MS COCO with image-level labels.
Demonstrates strong performance in weakly-supervised instance segmentation.
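The mIoU figures above are mean intersection-over-union scores. As a rough illustration of the metric only (not the benchmark evaluation code), the sketch below computes mIoU for a single pair of flattened label maps. The function name `mean_iou` and the toy labels are hypothetical. Standard benchmark evaluation accumulates intersections and unions over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flat sequences of integer class labels of equal length.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical labels): 0 = background, 1 and 2 = object classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5; reported results are this value expressed as a percentage.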
Abstract
Generating reliable pseudo masks from image-level labels is challenging in the weakly supervised semantic segmentation (WSSS) task due to the lack of spatial information. Prevalent class activation map (CAM)-based solutions struggle to discriminate foreground (FG) objects from suspicious, co-occurring background (BG) pixels and to learn the integral object regions. This paper proposes a simple fine-grained background representation (FBR) method to discover and represent diverse BG semantics and address the co-occurrence problem. We abandon using the class prototype or pixel-level features for BG representation. Instead, we develop a novel primitive, negative region of interest (NROI), to capture the fine-grained BG semantic information and conduct the pixel-to-NROI contrast to distinguish the confusing BG pixels. We also present an active sampling strategy to mine the…
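To make the idea of a pixel-to-region contrast concrete, here is a minimal sketch of a generic InfoNCE-style loss in which a pixel embedding is pulled toward a positive embedding and pushed away from a set of negative region embeddings. This is an assumption-laden illustration, not the paper's formulation: the abstract does not specify the loss, and the names `pixel_to_region_contrast`, `positive`, `negatives`, and the temperature value are all hypothetical. The negatives stand in for NROI embeddings, assumed here to be plain feature vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def pixel_to_region_contrast(pixel, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (an assumed stand-in, not the paper's loss).

    pixel:     embedding of the anchor pixel
    positive:  embedding the pixel should be close to
    negatives: embeddings of negative regions (hypothetical NROI features)
    """
    pos = math.exp(cosine(pixel, positive) / temperature)
    neg = sum(math.exp(cosine(pixel, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# A pixel aligned with its positive and orthogonal to the negative region.
easy = pixel_to_region_contrast([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# A confusing pixel that resembles the negative region more than its positive.
hard = pixel_to_region_contrast([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy setup the loss is near zero for the well-separated pixel and large for the confusing one, which is the behavior a contrastive objective relies on to separate background from foreground features.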
Taxonomy
Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques
Methods: Contrastive Learning
