Generating Self-Guided Dense Annotations for Weakly Supervised Semantic Segmentation
Zi-Yi Ke, Chiou-Ting Hsu

TL;DR
This paper introduces a self-guided approach that leverages multi-level features within a single neural network to generate dense pseudo labels for weakly supervised semantic segmentation, eliminating the need for external models.
Contribution
It proposes a novel unified framework that progressively refines pseudo labels using self-guided strategies within one network, improving segmentation accuracy under image-level supervision.
Findings
Outperforms existing weakly-supervised methods on PASCAL VOC 2012
Effectively generates dense pseudo labels without external models
Demonstrates improved segmentation performance with the proposed self-guided approach
Abstract
Learning semantic segmentation models under image-level supervision is far more challenging than under a fully supervised setting. Without knowing the exact pixel-label correspondence, most weakly supervised methods rely on external models to infer pseudo pixel-level labels for training semantic segmentation models. In this paper, we aim to develop a single neural network without resorting to any external models. We propose a novel self-guided strategy to fully utilize features learned across multiple levels to progressively generate the dense pseudo labels. First, we use high-level features as class-specific localization maps to roughly locate the classes. Next, we propose an affinity-guided method to encourage each localization map to be consistent with its intermediate-level features. Third, we adopt the training image itself as guidance and propose a self-guided refinement to…
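The first two steps of the abstract (coarse class localization from high-level features, then consistency with intermediate-level features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes generic CAM-style maps and a cosine-similarity affinity with random-walk propagation, and every function name, the `power` and `steps` parameters, and the `bg_threshold` value are hypothetical choices made for illustration only.

```python
import numpy as np

def localization_maps(high_feats, class_weights):
    """CAM-style maps (assumed form): high_feats (C, H, W), class_weights (K, C) -> (K, H, W) in [0, 1]."""
    maps = np.einsum("kc,chw->khw", class_weights, high_feats)
    maps = np.maximum(maps, 0.0)  # keep positive evidence only
    peak = maps.reshape(maps.shape[0], -1).max(axis=1)
    return maps / (peak[:, None, None] + 1e-8)  # per-class max normalization

def affinity_matrix(mid_feats, power=4):
    """Row-normalized pixel affinity from cosine similarity of mid-level features (D, H, W)."""
    d, h, w = mid_feats.shape
    flat = mid_feats.reshape(d, h * w)
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + 1e-8)
    sim = np.maximum(flat.T @ flat, 0.0) ** power  # sharpen similarities (hypothetical exponent)
    return sim / (sim.sum(axis=1, keepdims=True) + 1e-8)

def propagate(maps, affinity, steps=2):
    """Spread class scores between similar pixels with a few random-walk steps."""
    k, h, w = maps.shape
    flat = maps.reshape(k, h * w)
    for _ in range(steps):
        flat = flat @ affinity.T  # each pixel becomes a weighted average of its similar pixels
    return flat.reshape(k, h, w)

def pseudo_labels(maps, present, bg_threshold=0.3):
    """present: boolean (K,) image-level labels. Returns (H, W) ints: 0 = background, k + 1 = class k."""
    masked = maps * present[:, None, None]  # suppress classes absent from the image
    best = masked.argmax(axis=0)
    score = masked.max(axis=0)
    return np.where(score >= bg_threshold, best + 1, 0)
```

A typical call chain under these assumptions would be `pseudo_labels(propagate(localization_maps(high, w), affinity_matrix(mid)), present)`, where `high` and `mid` are feature tensors at the same spatial resolution and `present` encodes the image-level labels. The image-guided refinement step mentioned in the abstract is not sketched here because the text is truncated before it is described.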
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
