Visual Attention driven by Convolutional Features
Dario Zanca, Marco Gori

TL;DR
This paper presents a model of visual attention that derives saliency maps from convolutional neural networks trained for object classification and combines them with a differential eye-movement model to simulate human scanpaths. It is evaluated on saliency prediction and on similarity to human scanpaths.
Contribution
The paper introduces a novel approach combining CNN-derived saliency maps with a differential eye-movement model to predict visual attention and scanpaths.
Findings
Results on saliency prediction and similarity scores against human scanpaths demonstrate the model's effectiveness.
Saliency maps based on CNN features outperform traditional methods.
Model successfully simulates human-like eye-movement patterns.
Abstract
Understanding where humans look in a scene is a problem of great interest in visual perception and computer vision. When eye-tracking devices are not a viable option, models of human attention can be used to predict fixations. In this paper we make two contributions. First, we show a model of visual attention that is based simply on deep convolutional neural networks trained for object classification tasks. We define a method for visualizing saliency maps and evaluate it on a saliency prediction task. Second, we integrate the information from these maps with a bottom-up differential model of eye movements to simulate visual attention scanpaths. Results on saliency prediction and scores of similarity with human scanpaths demonstrate the effectiveness of this model.
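As a rough illustration of the two-stage idea described in the abstract (a saliency map built from convolutional features, then a scanpath driven by that map), here is a minimal Python sketch. It is not the authors' method: summing rectified activations over channels stands in for the paper's saliency visualization, and a greedy winner-take-all loop with inhibition of return stands in for the differential eye-movement model. The function names, parameters, and toy feature maps are all hypothetical.

```python
# Illustrative sketch only. Channel summation and greedy fixation selection
# are simplifying assumptions, not the model described in the paper.

def saliency_from_features(feature_maps):
    """Collapse C feature maps (each H x W, as nested lists) into one H x W map.

    Negative activations are rectified to zero, channels are summed,
    and the result is rescaled so that the maximum equals 1.0.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    sal = [[0.0] * width for _ in range(height)]
    for fmap in feature_maps:
        for y in range(height):
            for x in range(width):
                sal[y][x] += max(fmap[y][x], 0.0)
    peak = max(max(row) for row in sal)
    if peak > 0:
        sal = [[v / peak for v in row] for row in sal]
    return sal


def greedy_scanpath(saliency, n_fixations=3, inhibition_radius=1):
    """Pick fixations by winner-take-all with inhibition of return.

    Repeatedly fixate the most salient cell, then zero out a square
    neighbourhood around it so the next fixation moves elsewhere.
    Stops early once no salient cell remains.
    """
    sal = [row[:] for row in saliency]  # work on a copy
    height, width = len(sal), len(sal[0])
    path = []
    for _ in range(n_fixations):
        y0, x0 = max(
            ((y, x) for y in range(height) for x in range(width)),
            key=lambda p: sal[p[0]][p[1]],
        )
        if sal[y0][x0] <= 0.0:
            break
        path.append((y0, x0))
        for y in range(max(0, y0 - inhibition_radius),
                       min(height, y0 + inhibition_radius + 1)):
            for x in range(max(0, x0 - inhibition_radius),
                           min(width, x0 + inhibition_radius + 1)):
                sal[y][x] = 0.0
    return path


# Toy stand-in for two channels of CNN activations on a 4 x 4 grid.
toy_features = [
    [[0, 0, 0, 0], [0, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 2]],
    [[0, 0, 0, -1], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]],
]
saliency = saliency_from_features(toy_features)
path = greedy_scanpath(saliency)
print(path)  # [(1, 1), (3, 3)]
```

On the toy input the strongest response at row 1, column 1 is fixated first, followed by the weaker one at row 3, column 3. The loop then stops because nothing salient is left. In practice the feature maps would come from a pretrained classification network, and the fixation dynamics would follow the paper's differential model rather than this greedy rule.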
Taxonomy
Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Olfactory and Sensory Function Studies
