Pyramid Scene Parsing Network

Hengshuang Zhao; Jianping Shi; Xiaojuan Qi; Xiaogang Wang; Jiaya Jia

arXiv:1612.01105 · cs.CV · April 28, 2017 · 314 cites

TL;DR

The paper introduces PSPNet, a neural network architecture that uses pyramid pooling for scene parsing. It reports state-of-the-art accuracy on several datasets and ranked first in the ImageNet scene parsing challenge 2016 and on the PASCAL VOC 2012 and Cityscapes benchmarks.

Contribution

It proposes a pyramid pooling module that aggregates context from regions of different sizes, giving PSPNet a global prior that improves pixel-level scene parsing.
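The idea behind the pyramid pooling module can be sketched in a few lines of plain Python. This is an illustrative simplification, not the authors' implementation: it works on a single-channel feature map stored as a list of rows, uses nearest-neighbour upsampling, and omits the learned convolutions that a real network would apply to each pooled level. The function names and the bin sizes `(1, 2, 3, 6)` are assumptions for this sketch and are not stated on this page.

```python
def adaptive_avg_pool(fmap, bins):
    """Average a 2D feature map (list of rows) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Region bounds: floor for the start index, ceil for the end index.
        r0, r1 = (i * h) // bins, -((-(i + 1) * h) // bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, -((-(j + 1) * w) // bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(grid, h, w):
    """Resize a square grid back to h x w by nearest-neighbour lookup."""
    bins = len(grid)
    return [[grid[(r * bins) // h][(c * bins) // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 3, 6)):
    """Return the original map plus one upsampled context map per bin size.

    The bin sizes are an assumption of this sketch. A real module would
    concatenate these maps along the channel axis and feed them to a classifier.
    """
    h, w = len(fmap), len(fmap[0])
    levels = [fmap]
    for bins in bin_sizes:
        levels.append(upsample_nearest(adaptive_avg_pool(fmap, bins), h, w))
    return levels


# Toy 6 x 6 feature map whose value at (r, c) is r * 6 + c.
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
levels = pyramid_pool(feature_map)
```

The coarsest level (one bin) holds the global average at every position, while finer levels keep progressively more spatial detail. Stacking them gives each pixel access to context at several scales.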

Findings

01. Achieved 85.4% mIoU on PASCAL VOC 2012

02. Achieved 80.2% mIoU on Cityscapes

03. Ranked first in the ImageNet scene parsing challenge 2016
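For readers unfamiliar with the metric behind these figures, the following is a minimal sketch of how mean intersection-over-union (mIoU) is typically computed from a confusion matrix. The function name `mean_iou` and the toy two-class matrix are hypothetical illustrations and have no connection to the paper's actual evaluation code or results.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(confusion)
    ious = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted otherwise
        fp = sum(confusion[r][k] for r in range(n)) - tp   # predicted k, truly otherwise
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example with two classes and 20 pixels.
confusion = [[8, 2],
             [1, 9]]
score = mean_iou(confusion)
```

Here class 0 has IoU 8/11 and class 1 has IoU 9/12, so the mean is their average. Benchmarks may differ in details such as ignored labels, so this sketch shows the general definition only.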

Abstract

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes. In this paper, we exploit the capability of global context information by different-region-based context aggregation through our pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). Our global prior representation is effective in producing good-quality results on the scene parsing task, while PSPNet provides a superior framework for pixel-level prediction tasks. The proposed approach achieves state-of-the-art performance on various datasets. It came first in the ImageNet scene parsing challenge 2016, the PASCAL VOC 2012 benchmark, and the Cityscapes benchmark. A single PSPNet yields a new record of mIoU accuracy 85.4% on PASCAL VOC 2012 and accuracy 80.2% on Cityscapes.

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Advanced Neural Network Applications

Methods: Average Pooling · Auxiliary Classifier · Pyramid Pooling Module · Fully Convolutional Network · Random Gaussian Blur · RandomRotate · Random Horizontal Flip · Weight Decay · SGD with Momentum · Polynomial Rate Decay
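Of the methods listed, polynomial rate decay is the easiest to state as a formula: the learning rate is the base rate multiplied by (1 - iteration / max_iter) raised to a power. A minimal sketch follows. The function name `poly_lr`, the default power of 0.9, and the base rate of 0.01 used in the example are assumptions for illustration, since this page does not give the training hyperparameters.

```python
def poly_lr(base_lr, iteration, max_iter, power=0.9):
    """Polynomial ("poly") learning-rate decay.

    The power of 0.9 is an assumed default, not a value taken from this page.
    """
    return base_lr * (1 - iteration / max_iter) ** power


# Learning rate at a few points of a hypothetical 100-iteration schedule.
schedule = [poly_lr(0.01, it, 100) for it in (0, 25, 50, 75, 100)]
```

The rate starts at the base value, decreases monotonically, and reaches zero at the final iteration.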