DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation

Hanchao Li; Pengfei Xiong; Haoqiang Fan; Jian Sun

arXiv:1904.02216 · cs.CV · April 5, 2019 · 57 cites

TL;DR

DFANet is a highly efficient CNN architecture designed for real-time semantic segmentation, balancing speed and accuracy by reducing parameters and FLOPs while maintaining strong performance on benchmark datasets.

Contribution

The paper introduces DFANet, a lightweight CNN that starts from a single backbone and aggregates multi-scale features through sub-network and sub-stage cascades. It reaches accuracy comparable to state-of-the-art real-time segmentation methods with 8× fewer FLOPs.

Findings

01. Achieves 70.3% Mean IOU on Cityscapes with 1.7 GFLOPs

02. Runs at 160 FPS on a Titan X GPU

03. Uses 8× fewer FLOPs than previous methods
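
The figures above imply a per-frame latency budget and a sustained compute rate. A minimal sketch of that arithmetic, using only the numbers listed here; the variable names are illustrative, and the sustained rate is a derived estimate rather than a figure reported in the paper:

```python
# Figures quoted in the findings above.
gflops_per_frame = 1.7   # forward-pass cost per image, in GFLOPs
frames_per_second = 160  # reported throughput on a Titan X GPU

# Latency budget per frame, in milliseconds.
latency_ms = 1000.0 / frames_per_second

# Arithmetic rate needed to sustain that frame rate, in GFLOP/s.
# Derived estimate only; the paper does not report this number.
sustained_gflops_per_s = gflops_per_frame * frames_per_second

print(f"{latency_ms:.2f} ms per frame")
print(f"{sustained_gflops_per_s:.0f} GFLOP/s sustained")
```

The sustained rate is well below the peak throughput of a desktop GPU, which suggests that frame rate at this scale depends on more than raw FLOP count.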

Abstract

This paper introduces an extremely efficient CNN architecture named DFANet for semantic segmentation under resource constraints. Our proposed network starts from a single lightweight backbone and aggregates discriminative features through sub-network and sub-stage cascade respectively. Based on the multi-scale feature propagation, DFANet substantially reduces the number of parameters, but still obtains sufficient receptive field and enhances the model learning ability, which strikes a balance between the speed and segmentation performance. Experiments on Cityscapes and CamVid datasets demonstrate the superior performance of DFANet with 8× less FLOPs and 2× faster than the existing state-of-the-art real-time semantic segmentation methods while providing comparable accuracy. Specifically, it achieves 70.3% Mean IOU on the Cityscapes test dataset with only 1.7 GFLOPs and a…
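
Mean IOU, the accuracy metric quoted in the abstract, averages the per-class intersection-over-union. A minimal sketch of the standard definition computed from a confusion matrix follows; the function name and the toy pixel counts are illustrative assumptions and are not taken from the paper or its code:

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[g][p] counts pixels with ground-truth class g predicted as p.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        true_pos = confusion[c][c]
        gt_pixels = sum(confusion[c])                        # row sum
        pred_pixels = sum(confusion[r][c] for r in range(n))  # column sum
        union = gt_pixels + pred_pixels - true_pos
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy 2-class example with hypothetical pixel counts.
toy = [[3, 1],
       [1, 3]]
print(mean_iou(toy))
```

Each class here has 3 correctly labelled pixels out of a union of 5, so both per-class scores and their mean are 3/5.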

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings