Pyramid Attention Network for Semantic Segmentation

Hanchao Li; Pengfei Xiong; Jie An; Lingxue Wang

arXiv:1805.10180 · cs.CV · November 27, 2018 · 235 citations

TL;DR

This paper introduces the Pyramid Attention Network (PAN), which combines attention mechanisms with a spatial pyramid to capture global context and extract precise dense features for semantic segmentation, reporting state-of-the-art results on PASCAL VOC 2012 and Cityscapes.

Contribution

The paper presents a Pyramid Attention Network that integrates attention modules with a spatial pyramid to improve semantic segmentation, instead of relying on complicated dilated convolutions and hand-designed decoder networks.

Findings

01 Achieved 84.0% mIoU on PASCAL VOC 2012 without COCO training.

02 Outperformed existing methods on PASCAL VOC 2012 and Cityscapes benchmarks.

03 Introduced Feature Pyramid Attention and Global Attention Upsample modules.
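The mIoU figure quoted in the findings is mean intersection-over-union: the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels, averaged over classes. The snippet below is a minimal sketch of that metric on flat label lists. The function name `mean_iou` and the toy labels are illustrative assumptions, and the benchmark itself accumulates counts over an entire test set rather than a single sample.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes (toy, single-sample version).

    pred and target are equal-length sequences of integer class labels,
    one entry per pixel. Classes absent from both are skipped.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical six-pixel example with three classes
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so `score` evaluates to 0.5 up to floating-point rounding.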

Abstract

A Pyramid Attention Network (PAN) is proposed to exploit the impact of global contextual information in semantic segmentation. Unlike most existing works, we combine an attention mechanism with a spatial pyramid to extract precise dense features for pixel labeling, instead of relying on complicated dilated convolutions and artificially designed decoder networks. Specifically, we introduce a Feature Pyramid Attention module, which applies a spatial pyramid attention structure to the high-level output and combines it with global pooling to learn a better feature representation, and a Global Attention Upsample module on each decoder layer, which provides global context to guide low-level features in selecting category localization details. The proposed approach achieves state-of-the-art performance on the PASCAL VOC 2012 and Cityscapes benchmarks, with a new record mIoU of 84.0% on PASCAL VOC 2012, while training…
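The idea of using global context from high-level features to guide low-level features can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `global_attention_upsample`, the sigmoid gate standing in for learned layers, the nearest-neighbor upsampling, and the additive merge are all simplifying assumptions chosen to keep the example short.

```python
import numpy as np


def global_attention_upsample(low, high):
    """Toy channel-gating sketch (assumed form, not the paper's exact module).

    low:  (C, H, W) low-level feature map
    high: (C, h, w) high-level feature map, with H and W multiples of h and w
    """
    c, H, W = low.shape
    ch, h, w = high.shape
    if ch != c or H % h != 0 or W % w != 0:
        raise ValueError("incompatible feature map shapes")

    # Global context: one scalar per channel via global average pooling
    context = high.mean(axis=(1, 2))                # shape (C,)

    # Assumption: a plain sigmoid replaces the learned projection
    weights = 1.0 / (1.0 + np.exp(-context))        # shape (C,)

    # Reweight each low-level channel by its global-context weight
    gated = low * weights[:, None, None]            # shape (C, H, W)

    # Assumption: nearest-neighbor upsampling of the high-level map
    up = high.repeat(H // h, axis=1).repeat(W // w, axis=2)

    # Assumption: merge by elementwise addition
    return gated + up


# Hypothetical shapes: 2 channels, 4x4 low-level map, 2x2 high-level map
low = np.ones((2, 4, 4))
high = np.zeros((2, 2, 2))
out = global_attention_upsample(low, high)
```

With an all-zero high-level map the gate is 0.5 for every channel, so each entry of `out` is 0.5 and the output keeps the low-level resolution `(2, 4, 4)`.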

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications