Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation

Soham Chattopadhyay; Hritam Basak

arXiv:2009.06911 · cs.CV · September 16, 2020 · 5 cites

Open Access

TL;DR

This paper introduces a multi-scale attention U-Net (MsAUNet) that enhances scene segmentation by integrating attention gates and a compound loss function, leading to improved accuracy and faster convergence on standard datasets.

Contribution

The paper presents a novel multi-scale attention mechanism within a modified U-Net architecture and a combined loss function for better scene segmentation performance.
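A combined loss of this kind can be sketched as a weighted sum of a soft Dice term and a second term. This is only an illustration: the page names Dice Loss as a method but does not state the other component or the weighting, so the cross-entropy term, the `dice_weight` parameter, and all function names below are assumptions rather than the authors' formulation.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dice_loss(probs, onehot, eps=1e-6):
    """Soft Dice loss averaged over classes; inputs have shape (C, H, W)."""
    inter = (probs * onehot).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + onehot.sum(axis=(1, 2))
    dice = (2.0 * inter + eps) / (denom + eps)
    return float(1.0 - dice.mean())

def cross_entropy(probs, onehot, eps=1e-12):
    """Pixel-wise cross-entropy averaged over all pixels."""
    return float(-(onehot * np.log(probs + eps)).sum(axis=0).mean())

def compound_loss(logits, labels, num_classes, dice_weight=0.5):
    """Hypothetical compound loss: weighted Dice + cross-entropy.

    logits: (C, H, W) raw scores; labels: (H, W) integer class ids.
    The 0.5 default weight is an assumption, not a value from the paper.
    """
    probs = softmax(logits, axis=0)
    onehot = np.eye(num_classes)[labels].transpose(2, 0, 1)  # (C, H, W)
    return (dice_weight * dice_loss(probs, onehot)
            + (1.0 - dice_weight) * cross_entropy(probs, onehot))
```

The Dice term rewards region overlap per class, which tends to help with class imbalance, while the cross-entropy term supplies a per-pixel gradient signal; combining the two is a common design choice in segmentation work.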

Findings

01 Achieved 79.88% mean IoU on PascalVOC2012

02 Achieved 44.88% mean IoU on ADE20k

03 Outperformed existing models in segmentation accuracy
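For readers unfamiliar with the metric, mean IoU (intersection over union, averaged over classes) can be computed from a confusion matrix. The sketch below is a minimal NumPy version; the function name, the `ignore_index=255` convention, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union between two integer label maps.

    Pixels whose target equals ignore_index are excluded. Classes that
    appear in neither pred nor target are left out of the average.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index
    pred, target = pred[keep], target[keep]

    # Confusion matrix: rows are ground-truth classes, columns predictions.
    cm = np.bincount(target * num_classes + pred,
                     minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)

    inter = np.diag(cm)                          # true positives per class
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    present = union > 0
    return float((inter[present] / union[present]).mean())
```

Benchmark figures are usually obtained by accumulating the confusion matrix over the whole validation set before taking the ratio, rather than averaging per-image scores.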

Abstract

Despite the growing success of convolutional neural networks (CNNs) in scene segmentation in the recent past, the standard models lack some important features, which can result in sub-optimal segmentation outputs. The widely used encoder-decoder architecture extracts and uses several redundant, low-level features at different steps and different scales. Also, these networks fail to map the long-range dependencies of local features, which results in discriminative feature maps corresponding to each semantic class in the resulting segmented image. In this paper, we propose a novel multi-scale attention network for scene segmentation that uses the rich contextual information from an image. Different from the original UNet architecture, we have used attention gates which take the features from the encoder and the output of the pyramid pool as input and produced…
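To make the attention-gate idea concrete, here is a minimal NumPy sketch of a common additive gating form, in which a gating signal (here standing in for the pyramid-pool output) re-weights the encoder features pixel by pixel. The abstract is truncated before it describes the gate's output, so the additive form, the 1x1 projections `w_x`, `w_g`, `psi`, and the assumption that both inputs share the same spatial size are illustrative choices, not details confirmed by the paper.

```python
import numpy as np

def attention_gate(x, g, w_x, w_g, psi, b=0.0):
    """Additive attention gate (illustrative sketch).

    x   : encoder (skip) features, shape (Cx, H, W)
    g   : gating signal, shape (Cg, H, W), assumed same H and W as x
    w_x : 1x1 projection of x, shape (Ci, Cx)
    w_g : 1x1 projection of g, shape (Ci, Cg)
    psi : projection to a single attention channel, shape (Ci,)
    Returns the gated features and the attention map alpha of shape (H, W).
    """
    theta = np.einsum("ic,chw->ihw", w_x, x)   # project encoder features
    phi = np.einsum("ic,chw->ihw", w_g, g)     # project gating signal
    f = np.maximum(theta + phi + b, 0.0)       # ReLU on the combined signal
    score = np.einsum("i,ihw->hw", psi, f)     # one score per pixel
    alpha = 1.0 / (1.0 + np.exp(-score))       # sigmoid, values in [0, 1]
    return x * alpha[None, :, :], alpha
```

In a full network the projections would be learned convolutions and the gating signal would typically be resampled to match the encoder resolution; both steps are omitted here for brevity.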

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications

Methods: Dice Loss · Convolution