Recurrent Attentional Networks for Saliency Detection

Jason Kuen; Zhenhua Wang; Gang Wang

arXiv:1604.03227 · cs.CV · April 13, 2016 · 52 cites

TL;DR

This paper introduces RACDNN, a recurrent attentional network that improves saliency detection by iteratively focusing on image regions and learning context-aware features, outperforming existing methods.

Contribution

The novel RACDNN model combines spatial transformers and recurrent units to address scale variation and context modeling in saliency detection.

Findings

1. RACDNN outperforms state-of-the-art methods on multiple datasets.

2. It effectively handles objects of multiple scales.

3. The model demonstrates improved saliency refinement through iterative attention.

Abstract

Convolutional-deconvolution networks can be adopted to perform end-to-end saliency detection, but they do not work well with objects of multiple scales. To overcome this limitation, we propose a recurrent attentional convolutional-deconvolution network (RACDNN). Using spatial transformer and recurrent network units, RACDNN iteratively attends to selected image sub-regions and refines their saliency progressively. Besides tackling the scale problem, RACDNN also learns context-aware features from past iterations to enhance saliency refinement in future iterations. Experiments on several challenging saliency detection datasets validate the effectiveness of RACDNN and show that it outperforms state-of-the-art saliency detection methods.
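
To make the "attend to selected image sub-regions" step more concrete, here is a minimal NumPy sketch of the sampling a spatial transformer performs: an affine grid followed by bilinear interpolation. It is an illustration under stated assumptions, not the authors' implementation. The function names (`affine_grid`, `bilinear_sample`, `attend`) are hypothetical, the transform is assumed to be limited to isotropic scale plus translation, and the glimpse parameters are fixed by hand here, whereas in RACDNN they would be predicted by the recurrent units. The saliency refinement itself is omitted.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map every output pixel to a source location in normalized [-1, 1] coordinates."""
    ys = np.linspace(-1.0, 1.0, out_h)
    xs = np.linspace(-1.0, 1.0, out_w)
    gx, gy = np.meshgrid(xs, ys)                       # each of shape (out_h, out_w)
    coords = np.stack([gx, gy, np.ones_like(gx)], -1)  # homogeneous (x, y, 1)
    return coords @ theta.T                            # (out_h, out_w, 2): (x_src, y_src)

def bilinear_sample(img, grid):
    """Bilinearly sample a 2-D array at normalized (x, y) locations."""
    h, w = img.shape
    # Convert normalized coordinates to pixel coordinates and clamp to the image.
    x = np.clip((grid[..., 0] + 1.0) * 0.5 * (w - 1), 0, w - 1)
    y = np.clip((grid[..., 1] + 1.0) * 0.5 * (h - 1), 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def attend(img, scale, tx, ty, out_h, out_w):
    """Extract a glimpse; scale < 1 zooms in, (tx, ty) shifts the window centre.

    Assumption: the transform is isotropic scale plus translation only.
    """
    theta = np.array([[scale, 0.0, tx],
                      [0.0, scale, ty]])
    return bilinear_sample(img, affine_grid(theta, out_h, out_w))

# Hand-picked (scale, tx, ty) glimpses, standing in for parameters that a
# recurrent network would emit at each iteration.
image = np.arange(25, dtype=float).reshape(5, 5)
glimpse_params = [(1.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
glimpses = [attend(image, s, tx, ty, 3, 3) for s, tx, ty in glimpse_params]
```

With these parameters the first glimpse covers the whole image at a coarse resolution and the second zooms into the central 3×3 window. This mirrors the idea of viewing differently sized regions at a fixed output size, which is how attention of this kind can cope with scale variation.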

Taxonomy

Topics: Visual Attention and Saliency Detection · Olfactory and Sensory Function Studies · Image and Video Quality Assessment

Methods: Spatial Transformer