TL;DR
This paper introduces a deep multi-level neural network architecture for saliency prediction that combines features from different CNN levels. It outperforms existing models on the SALICON dataset and achieves competitive results on the MIT300 benchmark.
Contribution
The paper proposes a multi-level feature integration architecture for saliency prediction. Where current state-of-the-art models predict saliency from last-layer convolutional features only, this model combines features extracted at several CNN levels.
Findings
Outperforms state-of-the-art models on the SALICON dataset under all evaluation metrics
Achieves competitive results on the MIT300 benchmark
Abstract
This paper presents a novel deep architecture for saliency prediction. Current state-of-the-art models for saliency prediction employ Fully Convolutional networks that perform a non-linear combination of features extracted from the last convolutional layer to predict saliency maps. We propose an architecture which, instead, combines features extracted at different levels of a Convolutional Neural Network (CNN). Our model is composed of three main blocks: a feature extraction CNN, a feature encoding network that weights low- and high-level feature maps, and a prior learning network. We compare our solution with state-of-the-art saliency models on two public benchmark datasets. Results show that our model outperforms under all evaluation metrics on the SALICON dataset, which is currently the largest public dataset for saliency prediction, and achieves competitive results on the MIT300…
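The data flow through the three blocks named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the nearest-neighbour resizing, the single weight vector standing in for a learned 1x1 convolution, and the element-wise multiplication by the prior are all assumptions made for illustration. Random arrays stand in for the outputs of the feature extraction CNN.

```python
import numpy as np


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)


def encode_multilevel(features, weights, bias=0.0):
    """Fuse feature maps from several CNN levels into one (H, W) map.

    features: list of (C_i, H_i, W_i) arrays; assumed square-scaled so that
              each H_i divides the largest H (hypothetical constraint).
    weights:  (sum of C_i,) vector, a stand-in for a learned 1x1 convolution
              that weights low- and high-level channels.
    """
    target_h = max(f.shape[1] for f in features)
    resized = [upsample_nearest(f, target_h // f.shape[1]) for f in features]
    stacked = np.concatenate(resized, axis=0)              # (sum C_i, H, W)
    fused = np.tensordot(weights, stacked, axes=1) + bias  # (H, W)
    return np.maximum(fused, 0.0)                          # ReLU


def apply_prior(saliency, prior):
    """Modulate the map with a prior (assumed element-wise product)."""
    return saliency * prior


# Stand-ins for low-, mid-, and high-level CNN feature maps.
rng = np.random.default_rng(0)
low = rng.random((4, 16, 16))
mid = rng.random((8, 8, 8))
high = rng.random((16, 4, 4))

weights = rng.random(4 + 8 + 16)  # would be learned during training
prior = np.ones((16, 16))         # would be learned; all-ones is a no-op

saliency_map = apply_prior(encode_multilevel([low, mid, high], weights), prior)
```

In a trained model, `weights` and `prior` would be learned from data; here they are placeholders that only show how features from different depths end up contributing to a single saliency map.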
