Realtime Global Attention Network for Semantic Segmentation
Xi Mo, Xiangyu Chen

TL;DR
This paper introduces RGANet, a real-time global attention neural network for semantic segmentation whose attention module is built from depth-wise convolution and affine transformations. It also proposes an improved evaluation metric, MGRID, and reports leading performance against state-of-the-art segmentation architectures.
Contribution
The paper presents a novel global attention module integrated into a hierarchical architecture for real-time semantic segmentation, along with a new evaluation metric, MGRID.
Findings
In experiments against state-of-the-art semantic segmentation architectures, RGANet shows leading performance for robotic monocular visual perception.
The global attention module maintains high inference speed.
MGRID is proposed to alleviate the negative effect that non-convex, widely scattered ground-truth areas have on evaluation.
Abstract
In this paper, we propose an end-to-end realtime global attention neural network (RGANet) for the challenging task of semantic segmentation. Unlike the encoding strategy deployed by self-attention paradigms, the proposed global attention module encodes global attention via depth-wise convolution and affine transformations. Integrating these global attention modules into a hierarchical architecture maintains high inferential performance. In addition, an improved evaluation metric, namely MGRID, is proposed to alleviate the negative effect of non-convex, widely scattered ground-truth areas. Results from extensive experiments on state-of-the-art architectures for semantic segmentation demonstrate the leading performance of the proposed approaches for robotic monocular visual perception.
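The abstract names two building blocks, depth-wise convolution and affine transformations, without giving the module's layout. The sketch below only illustrates those two operations in plain Python; it is not the authors' module. The function names (`depthwise_conv2d`, `channel_affine`, `toy_global_attention`), the per-channel scale and shift, and the sigmoid gating step that combines them are all assumptions made for illustration.

```python
import math


def depthwise_conv2d(x, kernels):
    """Depth-wise 2D convolution with zero padding and stride 1.

    x: list of C channels, each an H x W list of lists.
    kernels: list of C square kernels with odd side length.
    Each channel is filtered by its own kernel only, so there is no
    cross-channel mixing. As in most deep learning libraries, the
    kernel is applied without flipping (cross-correlation).
    """
    out = []
    for channel, kernel in zip(x, kernels):
        h, w, k = len(channel), len(channel[0]), len(kernel)
        r = k // 2
        result = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - r, j + dj - r
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += kernel[di][dj] * channel[ii][jj]
                result[i][j] = acc
        out.append(result)
    return out


def channel_affine(x, scale, shift):
    """Per-channel affine transformation: y = scale[c] * x + shift[c]."""
    return [
        [[scale[c] * v + shift[c] for v in row] for row in channel]
        for c, channel in enumerate(x)
    ]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def toy_global_attention(x, kernels, scale, shift):
    """Hypothetical combination (an assumption, not from the paper):
    depth-wise convolution, then an affine transformation, then a
    sigmoid gate that reweights the input element-wise."""
    logits = channel_affine(depthwise_conv2d(x, kernels), scale, shift)
    return [
        [[v * sigmoid(a) for v, a in zip(row, a_row)]
         for row, a_row in zip(channel, a_channel)]
        for channel, a_channel in zip(x, logits)
    ]
```

Because each channel has its own kernel, the number of multiplications grows linearly with the channel count rather than quadratically as in a standard convolution, which is the usual reason depth-wise convolution appears in real-time designs.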
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Robotics and Sensor-Based Localization
