Scene Parsing via Dense Recurrent Neural Networks with Attentional Selection
Heng Fan, Peng Chu, Longin Jan Latecki, Haibin Ling

TL;DR
This paper introduces dense recurrent neural networks with an attention mechanism for scene parsing, capturing richer long-range dependencies and improving accuracy over existing methods.
Contribution
It proposes a novel dense RNN architecture with an attention model to enhance contextual dependency modeling in scene labeling.
Findings
Significant performance improvements on three large-scale benchmarks.
Outperforms existing state-of-the-art scene parsing algorithms.
Effective integration of dense RNNs with CNNs for end-to-end training.
Abstract
Recurrent neural networks (RNNs) have shown the ability to improve scene parsing through capturing long-range dependencies among image units. In this paper, we propose dense RNNs for scene labeling by exploring various long-range semantic dependencies among image units. Different from existing RNN based approaches, our dense RNNs are able to capture richer contextual dependencies for each image unit by enabling immediate connections between each pair of image units, which significantly enhances their discriminative power. Besides, to select relevant dependencies and meanwhile to restrain irrelevant ones for each unit from dense connections, we introduce an attention model into dense RNNs. The attention model allows automatically assigning more importance to helpful dependencies while less weight to unconcerned dependencies. Integrating with convolutional neural networks (CNNs), we…
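The attentional selection described in the abstract can be illustrated with a small sketch. This is not the authors' formulation: it assumes scaled dot-product scoring between unit features and a softmax over all other units, and the function name `attend_dense_dependencies` is hypothetical. It only shows the general idea of weighting the dependencies a unit receives from every other unit, so that helpful ones count more and irrelevant ones count less.

```python
import numpy as np

def attend_dense_dependencies(features):
    """Weight the dependencies each image unit receives from all other units.

    features: (N, D) array, one row per image unit (N >= 2).
    Returns (context, weights): context is (N, D), weights is (N, N).
    Assumption: scaled dot-product scoring; the paper may score differently.
    """
    features = np.asarray(features, dtype=float)
    n, d = features.shape
    if n < 2:
        raise ValueError("need at least two image units")

    # Dense connections: a score for every pair of units.
    scores = features @ features.T / np.sqrt(d)
    # A unit does not draw a dependency from itself.
    np.fill_diagonal(scores, -np.inf)

    # Numerically stable softmax over the other units (row-wise).
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    # Attention-weighted summary of the dependencies for each unit.
    context = weights @ features
    return context, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    units = rng.normal(size=(5, 4))  # 5 image units, 4-dim features
    context, weights = attend_dense_dependencies(units)
    print(weights.round(3))
```

Each row of `weights` sums to 1, so a unit's context vector is a convex combination of the features of the other units.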
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
