Attention to Scale: Scale-aware Semantic Image Segmentation
Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, Alan L. Yuille

TL;DR
This paper introduces an attention mechanism that fuses multi-scale features for semantic image segmentation, improving both accuracy and interpretability over average- and max-pooling on multiple challenging datasets.
Contribution
It proposes a novel attention model that adaptively weights multi-scale features at each pixel, enhancing segmentation accuracy and interpretability in deep neural networks.
Findings
Outperforms average- and max-pooling methods
Visualizes the importance of features at different positions and scales
Achieves state-of-the-art results on multiple datasets
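As a rough illustration of the fusion step described above, the following sketch contrasts average-pooling, max-pooling, and per-pixel attention over scales. It is a minimal NumPy example under stated assumptions, not the authors' implementation: the function names `softmax` and `fuse_scales`, the array shapes, and the choice of a softmax over scales to produce the soft weights are all assumptions made for illustration, and the attention logits are treated as given rather than produced by a trained network.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)


def fuse_scales(scores, attention_logits, mode="attention"):
    """Merge per-scale score maps into a single score map.

    scores:           array of shape (S, C, H, W), class scores from S input
                      scales, assumed already resized to a common H x W.
    attention_logits: array of shape (S, H, W), one unnormalized weight per
                      scale and pixel (hypothetical attention-head output).
    """
    if mode == "average":
        return scores.mean(axis=0)
    if mode == "max":
        return scores.max(axis=0)
    if mode == "attention":
        # Assumption: soft weights are a softmax over the scale axis, so the
        # weights at each pixel are non-negative and sum to 1.
        weights = softmax(attention_logits, axis=0)          # (S, H, W)
        # Broadcast the per-pixel weights across the class axis.
        return (weights[:, None, :, :] * scores).sum(axis=0)  # (C, H, W)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores = rng.normal(size=(2, 21, 8, 8))   # 2 scales, 21 classes, 8x8 map
    logits = rng.normal(size=(2, 8, 8))       # stand-in attention logits
    fused = fuse_scales(scores, logits)
    labels = fused.argmax(axis=0)             # per-pixel class prediction
    print(fused.shape, labels.shape)
```

In this sketch, uniform attention logits reduce the attention mode to average-pooling, while a strongly peaked logit at one scale selects that scale's scores at the pixel. The normalized weight maps are also what one would plot to inspect which scale dominates at each position.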
Abstract
Incorporating multi-scale features in fully convolutional neural networks (FCNs) has been a key element to achieving state-of-the-art performance on semantic image segmentation. One common way to extract multi-scale features is to feed multiple resized input images to a shared deep network and then merge the resulting features for pixelwise classification. In this work, we propose an attention mechanism that learns to softly weight the multi-scale features at each pixel location. We adapt a state-of-the-art semantic image segmentation model, which we jointly train with multi-scale input images and the attention model. The proposed attention model not only outperforms average- and max-pooling, but allows us to diagnostically visualize the importance of features at different positions and scales. Moreover, we show that adding extra supervision to the output at each scale is essential to…
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
