Saliency Prediction in the Deep Learning Era: Successes, Limitations, and Future Challenges
Ali Borji

TL;DR
This paper reviews recent advances in deep learning-based visual saliency models, highlighting their successes, limitations, and future challenges in achieving human-level accuracy across image and video datasets.
Contribution
It provides a comprehensive review of new deep saliency models, benchmarks, and datasets, and discusses the factors behind the gap between models and human attention, proposing directions for future research.
Findings
Deep models outperform traditional methods but still lag behind humans.
Benchmark comparisons reveal specific failure modes of current models.
The review identifies key factors and challenges for developing next-generation saliency models.
Abstract
Visual saliency models have enjoyed a big leap in performance in recent years, thanks to advances in deep learning and large-scale annotated data. Despite enormous effort and huge breakthroughs, however, models still fall short of human-level accuracy. In this work, I explore the landscape of the field, emphasizing new deep saliency models, benchmarks, and datasets. A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large-scale video datasets. Further, I identify factors that contribute to the gap between models and humans and discuss remaining issues that need to be addressed to build the next generation of more powerful saliency models. Some specific questions that are addressed include: in what ways current models fail, how to remedy them, what can be learned from cognitive studies of attention, how explicit…
Taxonomy
Topics: Visual Attention and Saliency Detection · Face Recognition and Perception · Image and Video Quality Assessment
