Hierarchical Multi-Scale Attention for Semantic Segmentation
Andrew Tao, Karan Sapra, Bryan Catanzaro

TL;DR
This paper introduces a hierarchical multi-scale attention mechanism for semantic segmentation that improves accuracy and training efficiency by adaptively combining predictions at different scales, achieving state-of-the-art results.
Contribution
The paper presents a novel hierarchical attention approach for multi-scale prediction fusion, enhancing efficiency and accuracy in semantic segmentation tasks.
Findings
Achieves state-of-the-art results on the Cityscapes and Mapillary Vistas datasets.
Is roughly 4x more memory efficient to train than other recent approaches, which allows training with larger crop sizes.
Improves generalization through auto-labelling on Cityscapes.
Abstract
Multi-scale inference is commonly used to improve the results of semantic segmentation. Multiple image scales are passed through a network and then the results are combined with averaging or max pooling. In this work, we present an attention-based approach to combining multi-scale predictions. We show that predictions at certain scales are better at resolving particular failure modes, and that the network learns to favor those scales for such cases in order to generate better predictions. Our attention mechanism is hierarchical, which enables it to be roughly 4x more memory efficient to train than other recent approaches. In addition to enabling faster training, this allows us to train with larger crop sizes, which leads to greater model accuracy. We demonstrate the results of our method on two datasets: Cityscapes and Mapillary Vistas. For Cityscapes, which has a large number of weakly…
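The difference between the fusion strategies named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the assumption that all per-scale predictions have already been resized to a common resolution, and the convention that a per-pixel weight alpha applies to the running fused prediction while 1 - alpha applies to the next scale are all illustrative choices. In the paper the attention maps are learned by the network; here they are simply passed in as arrays.

```python
import numpy as np

def fuse_average(preds):
    """Baseline: average the per-scale predictions (each of shape C x H x W)."""
    return np.mean(preds, axis=0)

def fuse_max(preds):
    """Baseline: element-wise maximum over the per-scale predictions."""
    return np.max(preds, axis=0)

def fuse_hierarchical(preds, attns):
    """Chain pairwise attention-weighted fusion across scales.

    preds: list of K arrays of shape (C, H, W), already at a common resolution
           (an assumption made to keep the sketch short).
    attns: list of K-1 arrays of shape (1, H, W) with values in [0, 1];
           attns[i] weights the running fused result against preds[i + 1].
    """
    if len(attns) != len(preds) - 1:
        raise ValueError("need exactly one attention map per adjacent pair of scales")
    fused = preds[0]
    for pred, attn in zip(preds[1:], attns):
        # Per-pixel convex combination; attn broadcasts over the class axis.
        fused = attn * fused + (1.0 - attn) * pred
    return fused

# Toy example: three constant "predictions" and two constant attention maps.
preds = [np.full((2, 2, 2), v) for v in (1.0, 3.0, 5.0)]
attns = [np.full((1, 2, 2), 0.25), np.full((1, 2, 2), 0.5)]
fused = fuse_hierarchical(preds, attns)
```

Because each step only combines two predictions, an attention map is needed only between adjacent scales rather than one per scale over the full set, which is the intuition behind the hierarchical design described in the abstract.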
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
