SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, Zhengning Liu, Ming-Ming Cheng, Shi-Min Hu

TL;DR
SegNeXt introduces a convolutional attention-based architecture that outperforms transformer-based models in semantic segmentation tasks, achieving higher accuracy with fewer parameters and computational resources.
Contribution
The paper proposes a novel convolutional attention mechanism for semantic segmentation, demonstrating its efficiency and effectiveness over transformer-based approaches.
Findings
SegNeXt outperforms previous state-of-the-art methods on multiple benchmarks.
Achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard with only 1/10 of the parameters of EfficientNet-L2 w/ NAS-FPN.
Improves mIoU by about 2.0% on average over prior state-of-the-art methods on ADE20K, with the same or less computation.
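The findings above are reported in mIoU (mean intersection-over-union). As a rough illustration of how that metric is usually computed, here is a minimal pure-Python sketch. The function name `mean_iou`, the flat label lists, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions made for this example, not details taken from the paper or its evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over classes present in pred or target.

    pred and target are equal-length flat sequences of integer class labels,
    one entry per pixel (hypothetical input format for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (an assumption)
            ious.append(intersection / union)

    # No class present at all: return 0.0 rather than dividing by zero.
    return sum(ious) / len(ious) if ious else 0.0


# Six pixels, three classes: per-class IoU is 1/3, 2/3 and 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Benchmark toolkits typically accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores, so this sketch matches that convention only when given all pixels at once.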
Abstract
We present SegNeXt, a simple convolutional network architecture for semantic segmentation. Recent transformer-based models have dominated the field of semantic segmentation due to the efficiency of self-attention in encoding spatial information. In this paper, we show that convolutional attention is a more efficient and effective way to encode contextual information than the self-attention mechanism in transformers. By re-examining the characteristics owned by successful segmentation models, we discover several key components leading to the performance improvement of segmentation models. This motivates us to design a novel convolutional attention network that uses cheap convolutional operations. Without bells and whistles, our SegNeXt significantly improves the performance of previous state-of-the-art methods on popular benchmarks, including ADE20K, Cityscapes, COCO-Stuff, Pascal VOC,…
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
Methods: Test · Average Pooling · Convolution · Global Average Pooling · Batch Normalization · NAS-FPN
