Implicit Saliency in Deep Neural Networks
Yutong Sun, Mohit Prabhushankar, Ghassan AlRegib

TL;DR
This paper shows that deep neural networks trained for recognition and localization can implicitly predict human visual saliency without any saliency training. Their performance is comparable to supervised methods, and they are more robust when large noise is added to the input images.
Contribution
It introduces the concept of implicit saliency in deep neural networks and shows how to extract it in an unsupervised fashion using the expectancy-mismatch hypothesis, giving insight into feature contributions and robustness.
Findings
Implicit saliency prediction performs comparably to state-of-the-art supervised methods.
Implicit saliency is more robust than supervised algorithms when large noise is added to the input images.
Semantic features contribute more to saliency detection than low-level features.
Abstract
In this paper, we show that existing recognition and localization deep architectures, which have not been exposed to eye tracking data or any saliency datasets, are capable of predicting human visual saliency. We term this implicit saliency in deep neural networks. We calculate this implicit saliency using the expectancy-mismatch hypothesis in an unsupervised fashion. Our experiments show that extracting saliency in this fashion provides comparable performance when measured against state-of-the-art supervised algorithms. Additionally, its robustness exceeds that of those algorithms when we add large noise to the input images. We also show that semantic features contribute more than low-level features to human visual saliency detection.
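The robustness claim in the abstract can be pictured with a small evaluation loop. The sketch below is illustrative only and is not the authors' protocol: the abstract does not name a noise model or an evaluation metric, so additive Gaussian noise and the linear correlation coefficient (CC, a common saliency metric) are assumptions, and `saliency_fn` is a hypothetical placeholder for any model that maps an image to a saliency map.

```python
import numpy as np

def correlation_coefficient(pred, target):
    """Pearson linear correlation between two saliency maps of equal shape."""
    p = (pred - pred.mean()) / (pred.std() + 1e-12)
    t = (target - target.mean()) / (target.std() + 1e-12)
    return float((p * t).mean())

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise (assumed noise model), clipped to [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0.0, 1.0)

def robustness_curve(saliency_fn, image, target, sigmas, seed=0):
    """Score saliency_fn against target at each noise level in sigmas."""
    rng = np.random.default_rng(seed)
    scores = []
    for sigma in sigmas:
        noisy = add_gaussian_noise(image, sigma, rng)
        scores.append(correlation_coefficient(saliency_fn(noisy), target))
    return scores

# Toy usage: a Gaussian blob serves as both the image and the reference map,
# and the identity function stands in for a saliency model.
ys, xs = np.mgrid[0:32, 0:32]
blob = np.exp(-((xs - 16) ** 2 + (ys - 16) ** 2) / (2 * 5.0 ** 2))
scores = robustness_curve(lambda img: img, blob, blob, sigmas=[0.0, 0.5])
```

Under these assumptions, a method is more robust when its CC falls more slowly as the noise level rises. Replacing the identity stand-in with a real model and the blob with recorded fixation maps would give a curve that can be compared across methods.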
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · CCD and CMOS Imaging Sensors
