Dilated SpineNet for Semantic Segmentation

Abdullah Rashwan; Xianzhi Du; Xiaoqi Yin; Jing Li

arXiv:2103.12270 · cs.CV · March 24, 2021 · 6 cites

TL;DR

This paper introduces SpineNet-Seg, a NAS-discovered, scale-permuted network with dilated convolutions that significantly improves semantic segmentation accuracy across multiple benchmarks, including Cityscapes and PASCAL VOC2012.

Contribution

The paper proposes SpineNet-Seg, a NAS-discovered, scale-permuted network with customized per-block dilation ratios for semantic segmentation. It outperforms the DeepLabv3/v3+ baselines at all model scales.

Findings

01. Achieves 83.04% mIoU on Cityscapes
02. Attains 85.56% mIoU on PASCAL VOC2012
03. Outperforms DeepLabv3/v3+ baselines in speed and accuracy
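
The findings above are reported in mIoU (mean intersection-over-union). As a rough illustration of how that metric is typically computed, here is a minimal sketch over flattened label maps. The function name `mean_iou`, the `ignore_index` value of 255, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example, not details taken from the paper. Benchmark evaluations usually accumulate the counts over the whole dataset before dividing.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth.

    `ignore_index` (assumed here to be 255) marks pixels excluded from scoring.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: does not count for or against any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes on six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is about 0.5.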

Abstract

Scale-permuted networks have shown promising results on object bounding box detection and instance segmentation. Scale permutation and cross-scale fusion of features enable the network to capture multi-scale semantics while preserving spatial resolution. In this work, we evaluate this meta-architecture design on semantic segmentation, another vision task that benefits from high spatial resolution and multi-scale feature fusion at different network stages. By further leveraging dilated convolution operations, we propose SpineNet-Seg, a network discovered by NAS that is searched from the DeepLabv3 system. SpineNet-Seg is designed with a better scale-permuted network topology with customized dilation ratios per block on a semantic segmentation task. SpineNet-Seg models outperform the DeepLabv3/v3+ baselines at all model scales on multiple popular benchmarks in speed and accuracy. In…
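
The abstract relies on dilated convolution, which widens a filter's receptive field without reducing the output resolution. The sketch below illustrates the general operation in one dimension using plain Python. It is not the SpineNet-Seg implementation; the function name `dilated_conv1d`, the zero padding, and the toy inputs are assumptions made for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """1-D dilated convolution (cross-correlation) with zero 'same' padding.

    Kernel taps are spaced `dilation` samples apart, so the receptive field
    is dilation * (len(kernel) - 1) + 1 while the output keeps the input length.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    k = len(kernel)
    span = dilation * (k - 1)      # distance between the first and last tap
    pad_left = span // 2
    pad_right = span - pad_left
    padded = [0.0] * pad_left + list(signal) + [0.0] * pad_right
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


signal = [1, 2, 3, 4, 5]
kernel = [1, 1, 1]

for rate in (1, 2):
    out = dilated_conv1d(signal, kernel, dilation=rate)
    receptive_field = rate * (len(kernel) - 1) + 1
    print(f"dilation={rate} receptive_field={receptive_field} output={out}")
```

With the same three-tap kernel, raising the dilation rate from 1 to 2 grows the receptive field from 3 to 5 samples while the output length stays at 5. This is the property that makes dilation useful for dense prediction tasks such as segmentation.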

Taxonomy

Topics: Advanced Neural Network Applications · Automated Road and Building Extraction · Video Surveillance and Tracking Methods

Methods: 1x1 Convolution · Spatial Pyramid Pooling · Batch Normalization · Atrous Spatial Pyramid Pooling · Convolution · DeepLabv3 · Dilated Convolution