TL;DR
This paper introduces a novel adversarial erasing method for weakly-supervised semantic segmentation that improves attention map quality without saliency masks, leading to better segmentation accuracy.
Contribution
The paper proposes an adversarial training framework in which two networks are optimized with opposing loss functions. This improves attention maps for weakly-supervised segmentation and removes the need for suboptimal strategies such as multiple training steps.
Findings
Improves segmentation performance by 2.1 mIoU over the baseline
Outperforms previous adversarial erasing methods by 1.0 mIoU
Does not require saliency masks for attention regularization
Abstract
Semantic segmentation is a task that traditionally requires a large dataset of pixel-level ground truth labels, which is time-consuming and expensive to obtain. Recent advancements in the weakly-supervised setting show that reasonable performance can be obtained by using only image-level labels. Classification is often used as a proxy task to train a deep neural network from which attention maps are extracted. However, the classification task needs only the minimum evidence to make predictions, hence it focuses on the most discriminative object regions. To overcome this problem, we propose a novel formulation of adversarial erasing of the attention maps. In contrast to previous adversarial erasing methods, we optimize two networks with opposing loss functions, which eliminates the requirement of certain suboptimal strategies; for instance, having multiple training steps that complicate…
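The idea of two networks with opposing loss functions can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`erase`, `bce`, `opposing_losses`), the soft-erasing rule, the exact form of both losses, and the weight `alpha` are all assumptions chosen for illustration. One network (here called the localizer) classifies the image and yields an attention map; the attended regions are erased; a second network (the adversary) tries to still recognize the class in what remains. The adversary is rewarded for succeeding, the localizer is penalized when it does, which pushes the attention map to cover more of the object.

```python
import numpy as np

def erase(image, attention):
    """Soft-erase attended regions (assumed rule: scale pixels by 1 - attention).

    image: array of shape (H, W, C); attention: array of shape (H, W) in [0, 1].
    """
    return image * (1.0 - attention)[..., None]

def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for a single class probability."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return -(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob))

def opposing_losses(p_localizer, p_adversary_on_erased, label=1.0, alpha=0.05):
    """Return (localizer_loss, adversary_loss) for one image and one class.

    p_localizer: localizer's class probability on the original image.
    p_adversary_on_erased: adversary's class probability on the erased image.
    alpha: hypothetical weight of the adversarial term.
    """
    # The adversary wants to still find the class after erasing.
    adversary_loss = bce(p_adversary_on_erased, label)
    # The localizer must classify correctly and is penalized whenever
    # the adversary still detects the class in the erased image.
    localizer_loss = bce(p_localizer, label) + alpha * p_adversary_on_erased
    return localizer_loss, adversary_loss
```

In this sketch the two losses move in opposite directions as the adversary's confidence on the erased image changes: a higher confidence lowers the adversary's loss and raises the localizer's loss. In a real training loop each loss would update only its own network's parameters.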
