Dual Attention Network for Scene Segmentation
Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, Hanqing Lu

TL;DR
This paper introduces Dual Attention Networks (DANet) that enhance scene segmentation by capturing rich contextual dependencies through spatial and channel attention modules, achieving state-of-the-art results on multiple datasets.
Contribution
The paper proposes a novel dual attention mechanism that adaptively integrates local features with global dependencies, improving segmentation accuracy over previous multi-scale fusion methods.
Findings
Achieved 81.5% Mean IoU on the Cityscapes test set.
Outperformed existing methods on the PASCAL Context and COCO Stuff datasets.
Demonstrated effectiveness of spatial and channel attention modules in scene segmentation.
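For readers unfamiliar with the Mean IoU metric quoted above, the following is a minimal sketch of how it is commonly computed from flat lists of predicted and ground-truth class labels. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; benchmark evaluation scripts may handle such cases differently.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class intersection-over-union across classes.

    pred, target: equal-length sequences of integer class labels (one per pixel).
    Classes that appear in neither sequence are skipped (an assumption here).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is their average.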
Abstract
In this paper, we address the scene segmentation task by capturing rich contextual dependencies based on the self-attention mechanism. Unlike previous works that capture contexts by multi-scale feature fusion, we propose a Dual Attention Network (DANet) to adaptively integrate local features with their global dependencies. Specifically, we append two types of attention modules on top of a traditional dilated FCN, which model the semantic interdependencies in the spatial and channel dimensions respectively. The position attention module selectively aggregates the features at each position by a weighted sum of the features at all positions. Similar features would be related to each other regardless of their distances. Meanwhile, the channel attention module selectively emphasizes interdependent channel maps by integrating associated features among all channel maps. We sum the outputs of the…
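The two modules described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes dot-product similarity with softmax normalisation (the abstract only specifies a "weighted sum"), omits any learned layers, and uses plain Python lists so that it runs without dependencies. The names `position_attention` and `channel_attention` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def position_attention(feats):
    """feats: list of N position vectors, each of length C.

    Each output vector is a weighted sum of the features at all positions,
    with weights given by the softmax of dot-product similarities
    (the similarity function is an assumption of this sketch).
    """
    out = []
    for q in feats:
        weights = softmax([dot(q, k) for k in feats])
        out.append([sum(w * k[c] for w, k in zip(weights, feats))
                    for c in range(len(q))])
    return out


def channel_attention(feats):
    """Same weighted sum, taken across channel maps instead of positions."""
    channels = [list(col) for col in zip(*feats)]  # C channel maps of length N
    mixed = position_attention(channels)           # mix similar channel maps
    return [list(col) for col in zip(*mixed)]      # back to N x C
```

Because the weights come from feature similarity rather than spatial distance, two positions with similar features influence each other wherever they sit in the image, which is the property the abstract highlights.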
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
Methods: Dual Attention Network · Average Pooling · Fully Convolutional Network · Residual Connection · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block
