Real-time Semantic Image Segmentation via Spatial Sparsity

Zifeng Wu; Chunhua Shen; Anton van den Hengel

arXiv:1712.00213 · cs.CV · December 4, 2017 · 53 cites

TL;DR

This paper introduces a semantic image segmentation method that exploits spatial sparsity to cut computational cost by a factor of 25, processing approximately 15 high-resolution Cityscapes images per second with limited impact on the quality of results.

Contribution

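The approach uses a two-column network, with one column taking the full-resolution image and the other a half-resolution version, and skips regions of the full-resolution image that can be safely ignored. This reduces computational cost by a factor of 25 while maintaining competitive accuracy.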

Findings

01

Processes approximately 15 high-resolution (1024x2048) Cityscapes images per second on a single GTX 980

02

Achieves 72.9% mean intersection-over-union on the Cityscapes test set

03

Reduces computational costs by a factor of 25
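The mean intersection-over-union score in the findings above is conventionally computed per class as TP / (TP + FP + FN) and then averaged over classes. The following is a minimal sketch of that standard metric, not the official Cityscapes evaluation code. The function name `mean_iou`, the flat label lists, and the ignore label 255 are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes, from flat lists of predicted and true labels.

    Assumption: pixels whose true label equals ignore_label are excluded,
    and classes that never occur in pred or target are left out of the mean.
    """
    # confusion[t][p] counts pixels with true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and one ignored pixel.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18.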

Abstract

We propose an approach to semantic (image) segmentation that reduces the computational costs by a factor of 25 with limited impact on the quality of results. Semantic segmentation has a number of practical applications, and for most such applications the computational costs are critical. The method follows a typical two-column network structure, where one column accepts an input image, while the other accepts a half-resolution version of that image. By identifying specific regions in the full-resolution image that can be safely ignored, as well as carefully tailoring the network structure, we can process approximately 15 high-resolution Cityscapes images (1024x2048) per second using a single GTX 980 video card, while achieving a mean intersection-over-union score of 72.9% on the Cityscapes test set.
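To illustrate the idea of skipping full-resolution regions, here is a minimal sketch of one possible selection rule. The abstract does not state how the ignorable regions are identified, so the rule used here is purely an assumption for illustration: a block is skipped when a per-pixel confidence map (imagined as coming from the half-resolution column) is high everywhere inside it. The names `select_active_blocks`, `skipped_fraction`, `block`, and `threshold` are hypothetical.

```python
def select_active_blocks(confidence, block=2, threshold=0.9):
    """Return (block_row, block_col) indices of blocks that still need
    full-resolution processing.

    Assumption: a block is "active" if any pixel in it has confidence
    below the threshold; all other blocks are skipped.
    """
    rows, cols = len(confidence), len(confidence[0])
    active = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            values = [
                confidence[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            if min(values) < threshold:
                active.append((r0 // block, c0 // block))
    return active


def skipped_fraction(confidence, block=2, threshold=0.9):
    """Fraction of blocks the full-resolution column would not process."""
    rows, cols = len(confidence), len(confidence[0])
    n_blocks = ((rows + block - 1) // block) * ((cols + block - 1) // block)
    n_active = len(select_active_blocks(confidence, block, threshold))
    return 1.0 - n_active / n_blocks


# Toy 4x4 confidence map: two of the four 2x2 blocks contain uncertain pixels.
conf_map = [
    [0.99, 0.98, 0.95, 0.60],
    [0.97, 0.99, 0.92, 0.55],
    [0.99, 0.99, 0.99, 0.97],
    [0.40, 0.96, 0.98, 0.99],
]
active = select_active_blocks(conf_map)
saved = skipped_fraction(conf_map)
```

Here only the top-right and bottom-left blocks are marked active, so half of the full-resolution blocks would be skipped. The larger the share of skipped blocks, the less work remains for the full-resolution column.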

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection