Simple and Efficient Architectures for Semantic Segmentation

Dushyant Mehta; Andrii Skliar; Haitam Ben Yahia; Shubhankar Borse; Fatih Porikli; Amirhossein Habibian; Tijmen Blankevoort

arXiv:2206.08236 · cs.CV · June 17, 2022 · 5 cites

TL;DR

This paper demonstrates that simple encoder-decoder architectures with modified ResNet backbones can match or outperform complex models in semantic segmentation, offering efficient and practical solutions for both desktop and mobile applications.

Contribution

The authors introduce simple, efficient encoder-decoder architectures with enlarged receptive fields that outperform complex models like HRNet, using minor modifications to ResNet backbones.

Findings

01 Simple architectures match or surpass complex models on Cityscapes.

02 Enlarging receptive fields with minor modifications improves segmentation performance.

03 Proposed models are suitable for both desktop and mobile devices.
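The receptive-field finding above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' code: it computes the theoretical receptive field of a plain stack of convolutions. The `Conv` tuple, the function name, and both example stacks are hypothetical and do not correspond to the backbones in the paper. The effective receptive field the paper discusses is generally smaller than this theoretical value, so the numbers only illustrate the trend that adding layers after a downsampling step widens the field.

```python
from typing import Iterable, NamedTuple


class Conv(NamedTuple):
    """Hypothetical description of one convolution layer (illustration only)."""
    kernel: int
    stride: int = 1
    dilation: int = 1


def theoretical_receptive_field(layers: Iterable[Conv]) -> int:
    """Return the theoretical receptive field (in input pixels) of a layer stack."""
    rf = 1    # a single output unit initially sees one input pixel
    jump = 1  # distance, in input pixels, between adjacent units at this depth
    for layer in layers:
        # Each layer widens the field by (kernel - 1) steps of the current jump,
        # scaled by the dilation factor.
        rf += (layer.kernel - 1) * layer.dilation * jump
        # Striding increases the spacing between units for all later layers.
        jump *= layer.stride
    return rf


# Two made-up stacks: one stride-2 3x3 conv followed by 2 or 6 more 3x3 convs.
shallow = [Conv(3, stride=2)] + [Conv(3)] * 2
deeper = [Conv(3, stride=2)] + [Conv(3)] * 6

print(theoretical_receptive_field(shallow))  # 11
print(theoretical_receptive_field(deeper))   # 27
```

In this toy setting, four extra 3x3 layers at the downsampled resolution grow the theoretical field from 11 to 27 input pixels, which is the kind of inexpensive enlargement the finding refers to.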

Abstract

Though state-of-the-art architectures for semantic segmentation, such as HRNet, demonstrate impressive accuracy, the complexity arising from their salient design choices hinders a range of model acceleration tools, and they also rely on operations that are inefficient on current hardware. This paper demonstrates that a simple encoder-decoder architecture with a ResNet-like backbone and a small multi-scale head performs on par with or better than complex semantic segmentation architectures such as HRNet, FANet, and DDRNets. Naively applying deep backbones designed for image classification to semantic segmentation leads to sub-par results, owing to the much smaller effective receptive field of these backbones. Implicit among the various design choices put forth in works like HRNet, DDRNet, and FANet are networks with a large effective receptive field. It is natural to ask if a…

Code & Models

Repositories

qualcomm-ai-research/ffnet
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection

Methods: Residual Connection · Batch Normalization · Convolution · HRNet