Understanding Convolution for Semantic Segmentation
Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou,, Garrison Cottrell

TL;DR
This paper introduces novel convolutional techniques, DUC and HDC, to enhance pixel-level semantic segmentation, achieving state-of-the-art results on multiple benchmarks by capturing detailed information and enlarging receptive fields.
Contribution
The paper proposes dense upsampling convolution (DUC) and hybrid dilated convolution (HDC) frameworks to improve segmentation accuracy and address issues in existing convolutional methods.
Findings
Achieved 80.1% mIOU on Cityscapes dataset.
State-of-the-art results on KITTI road estimation.
Best performance on PASCAL VOC2012 segmentation.
Abstract
Recent advances in deep learning, especially deep convolutional neural networks (CNNs), have led to significant improvement over previous semantic segmentation systems. Here we show how to improve pixel-wise semantic segmentation by manipulating convolution-related operations that are of both theoretical and practical value. First, we design dense upsampling convolution (DUC) to generate pixel-level prediction, which is able to capture and decode more detailed information that is generally missing in bilinear upsampling. Second, we propose a hybrid dilated convolution (HDC) framework in the encoding phase. This framework 1) effectively enlarges the receptive fields (RF) of the network to aggregate global information; 2) alleviates what we call the "gridding issue" caused by the standard dilated convolution operation. We evaluate our approaches thoroughly on the Cityscapes dataset, and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
MethodsAverage Pooling · Residual Connection · *Communicated@Fast*How Do I Communicate to Expedia? · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling
