Squeeze-and-Attention Networks for Semantic Segmentation
Zilong Zhong, Zhong Qiu Lin, Rene Bidart, Xiaodan Hu, Ibrahim Ben Daya, Zhifeng Li, Wei-Shi Zheng, Jonathan Li, Alexander Wong

TL;DR
This paper introduces SANet, an architecture built on squeeze-and-attention modules that improve semantic segmentation by combining pixel-group attention with pixel-wise prediction, reaching 83.2% mIoU on PASCAL VOC (without COCO pre-training) and 54.4% mIoU on PASCAL Context.
Contribution
The paper proposes a new squeeze-and-attention module that improves segmentation by capturing spatial-channel dependencies and multi-scale context.
Findings
Achieves 83.2% mIoU on PASCAL VOC without COCO pre-training.
Achieves 54.4% mIoU on PASCAL Context, setting a new state-of-the-art.
Effectively models pixel-group and pixel-wise attention for better segmentation.
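The mIoU figures above are means of per-class intersection-over-union scores. As a rough illustration of the metric only (not the paper's evaluation code), the sketch below computes mIoU for a single pair of label maps with NumPy. The function name `mean_iou` is hypothetical, and benchmark scores are normally accumulated over a whole dataset rather than per image.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps.

    Illustrative helper: classes that appear in neither map are skipped,
    which is one common convention, not necessarily the benchmark's.
    """
    ious = []
    for cls in range(num_classes):
        pred_c = pred == cls
        target_c = target == cls
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 example with two classes.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.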
Abstract
The recent integration of attention mechanisms into segmentation networks improves their representational capabilities through a great emphasis on more informative features. However, these attention mechanisms ignore an implicit sub-task of semantic segmentation and are constrained by the grid structure of convolution kernels. In this paper, we propose a novel squeeze-and-attention network (SANet) architecture that leverages an effective squeeze-and-attention (SA) module to account for two distinctive characteristics of segmentation: i) pixel-group attention, and ii) pixel-wise prediction. Specifically, the proposed SA modules impose pixel-group attention on conventional convolution by introducing an 'attention' convolutional channel, thus taking into account spatial-channel inter-dependencies in an efficient manner. The final segmentation results are produced by merging outputs from…
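The abstract describes the SA module as a conventional convolution paired with an extra "attention" convolutional channel. The NumPy sketch below is one plausible reading of that idea, not the authors' implementation: the attention branch average-pools the input without fully squeezing it, applies a convolution, upsamples, and then re-weights the main branch. The 1x1 convolutions, the pooling factor, the `attn * main + attn` combination, and all function names are assumptions made for illustration.

```python
import numpy as np

def avg_pool2d(x, k):
    """Non-overlapping k x k average pooling on a (C, H, W) array."""
    c, h, w = x.shape
    assert h % k == 0 and w % k == 0, "H and W must be divisible by k"
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def conv1x1(x, weight):
    """1x1 convolution; weight has shape (C_out, C_in).

    Stand-in for a full spatial convolution, to keep the sketch short.
    """
    return np.einsum("oc,chw->ohw", weight, x)

def upsample_nearest(x, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return x.repeat(k, axis=1).repeat(k, axis=2)

def squeeze_and_attention(x, w_main, w_attn, pool=2):
    """Assumed SA block: the attention branch re-weights the main branch."""
    main = np.maximum(conv1x1(x, w_main), 0.0)         # main conv + ReLU
    squeezed = avg_pool2d(x, pool)                     # partial squeeze keeps coarse layout
    attn = np.maximum(conv1x1(squeezed, w_attn), 0.0)  # attention conv + ReLU
    attn = upsample_nearest(attn, pool)                # back to input resolution
    return attn * main + attn                          # assumed combination rule

# Toy usage: 3 channels, 4x4 feature map, identity channel mixing.
x = np.ones((3, 4, 4))
out = squeeze_and_attention(x, np.eye(3), np.eye(3))
```

Because the pooled map keeps a coarse spatial layout, each attention value covers a group of neighbouring pixels, which is the sense in which this sketch treats attention as "pixel-group" rather than purely channel-wise.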
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Convolution
