DeepFix: A Fully Convolutional Neural Network for predicting Human Eye Fixations
Srinivas S. S. Kruthiventi, Kumar Ayush, R. Venkatesh Babu

TL;DR
DeepFix is a novel fully convolutional neural network that predicts human eye fixations by automatically learning hierarchical features and incorporating location bias, outperforming previous methods on standard datasets.
Contribution
We introduce DeepFix, the first fully convolutional network for saliency prediction that captures multi-scale semantics and models location bias end-to-end.
Findings
DeepFix outperforms recent approaches on the MIT300 and CAT2000 datasets.
The model effectively captures multi-scale semantic information.
Incorporating location bias improves saliency prediction accuracy.
Abstract
Understanding and predicting the human visual attentional mechanism is an active area of research in the fields of neuroscience and computer vision. In this work, we propose DeepFix, a first-of-its-kind fully convolutional neural network for accurate saliency prediction. Unlike classical works, which characterize the saliency map using various hand-crafted features, our model automatically learns features in a hierarchical fashion and predicts the saliency map in an end-to-end manner. DeepFix is designed to capture semantics at multiple scales while taking global context into account using network layers with very large receptive fields. Generally, fully convolutional nets are spatially invariant, which prevents them from modeling location-dependent patterns (e.g. centre-bias). Our network overcomes this limitation by incorporating a novel Location Biased Convolutional layer. We evaluate our…
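To make the spatial-invariance point concrete, here is a minimal NumPy sketch of one way a convolution can be made location-dependent: fixed, position-encoding maps are concatenated to the feature channels before the filter is applied. This is an illustration only, not the authors' implementation. The abstract does not say how the Location Biased Convolutional layer is built, so the centred Gaussian maps, their widths, the 1x1 filter, and the function names `gaussian_prior_maps` and `location_biased_pointwise_conv` are all assumptions made for this example.

```python
import numpy as np


def gaussian_prior_maps(height, width, sigmas):
    """Build centred Gaussian maps, one per (sigma_x, sigma_y) pair.

    Hypothetical helper: the widths in `sigmas` are illustrative values,
    not taken from the paper. Returns shape (len(sigmas), height, width).
    """
    ys = np.linspace(-1.0, 1.0, height)[:, None]  # column of row coordinates
    xs = np.linspace(-1.0, 1.0, width)[None, :]   # row of column coordinates
    maps = [
        np.exp(-(xs**2 / (2 * sx**2) + ys**2 / (2 * sy**2)))
        for sx, sy in sigmas
    ]
    return np.stack(maps, axis=0)


def location_biased_pointwise_conv(features, priors, weights, bias=0.0):
    """Concatenate location maps to the features, then apply a 1x1 convolution.

    features: (C, H, W) feature maps
    priors:   (P, H, W) fixed location maps
    weights:  (C + P,) one weight per input channel (a 1x1 filter,
              chosen here only to keep the sketch short)
    """
    stacked = np.concatenate([features, priors], axis=0)
    # A 1x1 convolution is a weighted sum over the channel axis.
    return np.tensordot(weights, stacked, axes=([0], [0])) + bias


# Constant features: a plain convolution would give the same value everywhere.
features = np.ones((2, 5, 5))
priors = gaussian_prior_maps(5, 5, [(0.5, 0.5), (1.0, 0.5)])
weights = np.array([0.1, 0.1, 1.0, 1.0])
out = location_biased_pointwise_conv(features, priors, weights)
```

With constant input features, the response of an ordinary filter is identical at every position. Here the output is larger at the image centre than at the corners, because the filter also weighs the location maps, which is the kind of centre-bias a spatially invariant network cannot express on its own.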
