Rethinking Atrous Convolution for Semantic Image Segmentation
Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam

TL;DR
This paper revisits atrous convolution for semantic segmentation, proposing multi-scale atrous modules and an Atrous Spatial Pyramid Pooling module augmented with image-level features. The resulting DeepLabv3 system improves on earlier DeepLab versions without DenseCRF post-processing.
Contribution
It introduces modules that apply atrous convolution in cascade or in parallel at multiple atrous rates, and augments Atrous Spatial Pyramid Pooling (ASPP) with image-level features that encode global context.
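The difference between the cascaded and parallel arrangements can be illustrated with a toy 1-D sketch. This is not the paper's implementation: the function names (`atrous_conv1d`, `cascade`, `parallel_with_global`), the zero padding, the shared filter, and the rates are assumptions chosen for illustration, and the real modules operate on 2-D feature maps with learned filters.

```python
def atrous_conv1d(x, w, rate):
    """Zero-padded 1-D atrous convolution with an odd-length filter `w`.

    Output i sums w[k] * x[i + rate * (k - centre)], so the taps are
    spaced `rate` samples apart (rate=1 is an ordinary convolution).
    """
    centre = len(w) // 2
    n = len(x)
    y = []
    for i in range(n):
        acc = 0.0
        for k, wk in enumerate(w):
            j = i + rate * (k - centre)
            if 0 <= j < n:  # samples outside the signal count as zero
                acc += wk * x[j]
        y.append(acc)
    return y


def cascade(x, w, rates):
    """Cascaded arrangement: each atrous layer feeds the next."""
    for rate in rates:
        x = atrous_conv1d(x, w, rate)
    return x


def parallel_with_global(x, w, rates):
    """Parallel arrangement in the spirit of ASPP (illustrative only).

    One branch per rate sees the same input; a global-average feature
    stands in for the image-level branch. The result holds one feature
    vector per position: [branch_rate_0, branch_rate_1, ..., global_mean].
    """
    branches = [atrous_conv1d(x, w, rate) for rate in rates]
    global_mean = sum(x) / len(x)
    branches.append([global_mean] * len(x))
    return [list(features) for features in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    box = [1, 1, 1]
    # Cascading rates 1 and 2 spreads the impulse over 7 positions.
    print(cascade(impulse, box, [1, 2]))
    # Each position gets one response per rate plus the global mean.
    print(parallel_with_global(impulse, box, [1, 2])[4])
```

In the cascade the context grows with depth, while in the parallel form every branch looks at the same input at a different scale and the image-level feature supplies context that no single rate covers.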
Findings
Significant performance improvement over previous DeepLab models.
Achieves competitive results on PASCAL VOC 2012 without DenseCRF.
Demonstrates effectiveness of multi-scale atrous convolution modules.
Abstract
In this work, we revisit atrous convolution, a powerful tool to explicitly adjust filter's field-of-view as well as control the resolution of feature responses computed by Deep Convolutional Neural Networks, in the application of semantic image segmentation. To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates. Furthermore, we propose to augment our previously proposed Atrous Spatial Pyramid Pooling module, which probes convolutional features at multiple scales, with image-level features encoding global context and further boost performance. We also elaborate on implementation details and share our experience on training our system. The proposed 'DeepLabv3' system significantly improves over our previous DeepLab versions without DenseCRF…
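How the atrous rate adjusts a filter's field-of-view can be made concrete with a little arithmetic. The sketch below is a minimal illustration under stated assumptions: the helper names `effective_kernel_size` and `tap_offsets` are hypothetical, and an odd-length filter is assumed. A filter with k taps spaced `rate` apart spans k + (k - 1)(rate - 1) input positions while keeping the same number of weights.

```python
def effective_kernel_size(kernel_size, rate):
    """Input span covered by a filter whose taps are `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def tap_offsets(kernel_size, rate):
    """Offsets (relative to the centre tap) that an odd-length atrous filter samples."""
    half = kernel_size // 2
    return [rate * (k - half) for k in range(kernel_size)]


if __name__ == "__main__":
    for rate in (1, 2, 4):
        print(rate, effective_kernel_size(3, rate), tap_offsets(3, rate))
```

With rate 1 the filter is an ordinary convolution; larger rates widen the span without adding parameters, which is why several rates can be combined to capture context at several scales.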
Code & Models
- 🤗 keras-io/deeplabv3p-resnet50 · model · 64 dl · ♡ 5
- 🤗 Matthijs/deeplabv3-mobilevit-small · model · 7 dl · ♡ 1
- 🤗 apple/deeplabv3-mobilevit-small · model · 1.1k dl · ♡ 18
- 🤗 apple/deeplabv3-mobilevit-x-small · model · 158 dl · ♡ 3
- 🤗 apple/deeplabv3-mobilevit-xx-small · model · 1.4k dl · ♡ 10
- 🤗 shehan97/mobilevitv2-1.0-voc-deeplabv3 · model · 569 dl
- 🤗 apple/mobilevitv2-1.0-voc-deeplabv3 · model · 86 dl · ♡ 2
- 🤗 SpotLab/MobileViT_DeepLabv3 · model · 144 dl
- 🤗 qualcomm/DeepLabV3-ResNet50 · model · 158 dl
- 🤗 qualcomm/DeepLabV3-Plus-MobileNet · model · 665 dl · ♡ 1
Videos
Semantic Segmentation in PyTorch | Neural Style Transfer #7 · YouTube
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Spatial Pyramid Pooling · Average Pooling · Atrous Spatial Pyramid Pooling · SGD with Momentum · Weight Decay · Random Horizontal Flip · Random Scaling
