Tell Me Where to Look: Guided Attention Inference Network
Kunpeng Li, Ziyan Wu, Kuan-Chuan Peng, Jan Ernst, Yun Fu

TL;DR
This paper introduces a guided attention inference network that makes attention maps an explicit part of training. This improves weakly supervised learning for tasks such as semantic segmentation, where it surpasses the previous state of the art on Pascal VOC 2012.
Contribution
It makes attention maps an explicit part of end-to-end training, improves these maps through self-guidance (supervision drawn from the network itself), and bridges the gap between weak supervision and extra supervision when the latter is available.
Findings
Surpasses the state of the art on the Pascal VOC 2012 val and test sets for semantic segmentation.
Effectively improves attention maps through self-guidance.
Provides a plug-in framework to enhance existing weakly supervised methods.
Abstract
Weakly supervised learning with only coarse labels can obtain visual explanations of deep neural networks, such as attention maps, by back-propagating gradients. These attention maps are then available as priors for tasks such as object localization and semantic segmentation. In one common framework we address three shortcomings of previous approaches in modeling such attention maps: We (1) for the first time make attention maps an explicit and natural component of the end-to-end training, (2) provide self-guidance directly on these maps by exploring supervision from the network itself to improve them, and (3) seamlessly bridge the gap between using weak and extra supervision if available. Despite its simplicity, experiments on the semantic segmentation task demonstrate the effectiveness of our methods. We clearly surpass the state-of-the-art on the Pascal VOC 2012 val. and test sets. Besides, the…
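To make "attention maps by back-propagating gradients" concrete, here is a minimal sketch in plain Python. It assumes a Grad-CAM-style formulation, which the abstract above does not spell out: each feature channel is weighted by the spatial mean of the gradient of a class score with respect to that channel, the weighted channels are summed, and negative values are clipped to zero. The function name `attention_map` and the toy inputs are hypothetical and chosen only for illustration. In a real network the features and gradients would come from a convolutional layer and automatic differentiation.

```python
def attention_map(features, grads):
    """Grad-CAM-style attention (assumed formulation, not taken from the summary).

    features: list of K channels, each an H x W list of floats (layer activations).
    grads:    same shape; gradient of a class score w.r.t. each activation.
    Returns an H x W map: ReLU(sum_k w_k * features[k]), where w_k is the
    spatial mean of grads[k].
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])

    # Channel weights: global average pooling of the gradients.
    weights = [sum(sum(row) for row in g) / (height * width) for g in grads]

    # Weighted sum of the feature channels.
    attn = [[0.0] * width for _ in range(height)]
    for k in range(channels):
        for i in range(height):
            for j in range(width):
                attn[i][j] += weights[k] * features[k][i][j]

    # ReLU keeps only locations that push the class score up.
    return [[max(0.0, v) for v in row] for row in attn]


# Toy input: two 2x2 channels (hypothetical values).
features = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
grads = [
    [[1.0, 1.0], [1.0, 1.0]],      # channel 0 supports the class (mean +1)
    [[-2.0, -2.0], [-2.0, -2.0]],  # channel 1 opposes it (mean -2)
]
attn = attention_map(features, grads)
print(attn)  # [[1.0, 0.0], [0.0, 2.0]]
```

Because every step here is differentiable almost everywhere, a map computed this way can in principle be placed inside a training loss, which is the sense in which the abstract describes attention as an explicit component of end-to-end training.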
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
