Visual Saliency Based on Multiscale Deep Features
Guanbin Li, Yizhou Yu

TL;DR
This paper presents a novel deep learning-based approach for visual saliency detection that leverages multiscale CNN features, a refinement process, and multiple segmentation levels, achieving state-of-the-art results and introducing a new large dataset.
Contribution
The paper introduces a neural network architecture for saliency detection built on multiscale CNN features, a refinement method that improves the spatial coherence of the saliency maps, aggregation of saliency maps computed at multiple segmentation levels, and a new challenging dataset.
Findings
Achieves state-of-the-art performance on public benchmarks.
Improves F-measure by 5.0% on MSRA-B and by 13.2% on HKU-IS.
Reduces mean absolute error by 5.7% on MSRA-B and by 35.1% on HKU-IS.
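The two metrics named above can be illustrated with a minimal sketch. It assumes saliency maps and ground-truth masks are nested lists of values in [0, 1]. The fixed binarization threshold of 0.5 and the weighting beta_sq = 0.3 (a common convention in saliency benchmarks) are assumptions, not details stated in this summary, and the function names are hypothetical.

```python
def mean_absolute_error(saliency, ground_truth):
    """Average per-pixel absolute difference between two maps in [0, 1]."""
    total = 0.0
    count = 0
    for s_row, g_row in zip(saliency, ground_truth):
        for s, g in zip(s_row, g_row):
            total += abs(s - g)
            count += 1
    return total / count


def f_measure(saliency, ground_truth, threshold=0.5, beta_sq=0.3):
    """Weighted F-measure of a thresholded saliency map.

    threshold and beta_sq are assumed defaults, not values from the summary.
    """
    tp = fp = fn = 0
    for s_row, g_row in zip(saliency, ground_truth):
        for s, g in zip(s_row, g_row):
            predicted = s >= threshold
            actual = g >= 0.5
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x2 example: one salient pixel, one false positive.
toy_saliency = [[0.9, 0.2], [0.6, 0.1]]
toy_truth = [[1.0, 0.0], [0.0, 0.0]]
toy_mae = mean_absolute_error(toy_saliency, toy_truth)
toy_f = f_measure(toy_saliency, toy_truth)
```

A lower mean absolute error and a higher F-measure both indicate a saliency map closer to the ground truth, which is why the findings report one as a reduction and the other as an improvement.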
Abstract
Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this CVPR 2015 paper, we discover that a high-quality visual saliency model can be trained with multiscale features extracted using a popular deep learning architecture, convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for extracting features at three different scales. We then propose a refinement method to enhance the spatial coherence of our saliency results. Finally, aggregating multiple saliency maps computed for different levels of image segmentation can further boost the performance, yielding saliency maps better than those generated from a single segmentation. To promote further…
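The aggregation step mentioned in the abstract can be sketched as a per-pixel weighted combination of saliency maps, one per segmentation level. The abstract excerpt here says only that the maps are aggregated, so the weighted average, the uniform default weights, and the function name below are illustrative assumptions rather than the authors' method.

```python
def aggregate_saliency_maps(maps, weights=None):
    """Fuse equally sized saliency maps by a per-pixel weighted sum.

    maps: list of 2D nested lists, one map per segmentation level.
    weights: hypothetical fusion weights; defaults to a uniform average.
    """
    if not maps:
        raise ValueError("at least one saliency map is required")
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)
    if len(weights) != len(maps):
        raise ValueError("need exactly one weight per saliency map")

    height, width = len(maps[0]), len(maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for level_map, weight in zip(maps, weights):
        for y in range(height):
            for x in range(width):
                fused[y][x] += weight * level_map[y][x]
    return fused


# Two toy 1x2 maps standing in for two segmentation levels.
coarse = [[1.0, 0.0]]
fine = [[0.0, 0.0]]
uniform_fusion = aggregate_saliency_maps([coarse, fine])
weighted_fusion = aggregate_saliency_maps([coarse, fine], weights=[0.75, 0.25])
```

In practice the weights would be chosen or fitted on training data; the uniform default is only a placeholder for that choice.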
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Image and Video Quality Assessment
