DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling

Akis Linardos, Matthias Kümmerer, Ori Press, Matthias Bethge

arXiv:2105.12441 · cs.LG · September 21, 2021

TL;DR

DeepGaze IIE advances saliency prediction by combining multiple backbones for better calibration and out-of-domain performance, achieving state-of-the-art results on key benchmarks.

Contribution

This work introduces DeepGaze IIE, a novel model that combines multiple ImageNet backbones for improved calibration and generalization in saliency prediction.

Findings

01

Replacing the VGG19 backbone of DeepGaze II with ResNet50 features improves saliency prediction performance from 78% to 85%.

02

Combining multiple backbones achieves 93% on MIT1003, a new state of the art.

03

When evaluated on other datasets, models are consistently overconfident in their fixation predictions.

Abstract

Since 2014, transfer learning has been the key driver of improvement in spatial saliency prediction; however, progress has stagnated over the last 3-5 years. We conduct a large-scale transfer learning study that tests different ImageNet backbones, always using the same readout architecture and learning protocol adopted from DeepGaze II. By replacing the VGG19 backbone of DeepGaze II with ResNet50 features, we improve the performance on saliency prediction from 78% to 85%. However, as we continue to test better ImageNet models as backbones (such as EfficientNetB5), we observe no additional improvement on saliency prediction. By analyzing the backbones further, we find that generalization to other datasets differs substantially, with models being consistently overconfident in their fixation predictions. We show that by combining multiple backbones in a principled manner a good…
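The abstract is cut off before it says how the backbones are combined, so the following is only an illustrative sketch and not the authors' confirmed method. It assumes the simplest possible combination, averaging the predicted fixation densities of several models cell by cell, and uses a toy 4-cell "image" with two hypothetical models (`model_a`, `model_b`) that are each overconfident about a different location. All names and numbers here are made up for illustration. The point is to show why averaging in probability space can repair overconfidence when scored by log-likelihood.

```python
import math

def normalize(weights):
    """Scale non-negative weights so they sum to 1 (a discrete fixation density)."""
    total = sum(weights)
    return [w / total for w in weights]

def mixture(densities):
    """Average several densities cell by cell, i.e. in probability space.

    Assumption: this equal-weight average stands in for the 'principled'
    combination mentioned in the abstract, which the truncated text does not specify.
    """
    n = len(densities)
    return [sum(cell) / n for cell in zip(*densities)]

def mean_log_likelihood(density, fixations):
    """Average log2-probability the density assigns to the observed fixation cells."""
    return sum(math.log2(density[i]) for i in fixations) / len(fixations)

# Two hypothetical models, each putting almost all mass on a different cell.
model_a = normalize([0.97, 0.01, 0.01, 0.01])
model_b = normalize([0.01, 0.97, 0.01, 0.01])
combined = mixture([model_a, model_b])

# Toy data: fixations fall on cells 0 and 1 equally often.
fixations = [0, 1, 0, 1]

ll_a = mean_log_likelihood(model_a, fixations)
ll_b = mean_log_likelihood(model_b, fixations)
ll_combined = mean_log_likelihood(combined, fixations)
ll_uniform = math.log2(1 / 4)  # baseline: every cell equally likely

print(f"model A:  {ll_a:.2f} bits/fixation")
print(f"model B:  {ll_b:.2f} bits/fixation")
print(f"combined: {ll_combined:.2f} bits/fixation")
print(f"uniform:  {ll_uniform:.2f} bits/fixation")
```

In this toy setup each individual model scores worse than the uniform baseline, because it assigns very low probability to half of the fixations, while the averaged density scores better than both the baseline and either model. More generally, because the logarithm is concave, the log-likelihood of an averaged density is never lower than the average of the individual log-likelihoods.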

Taxonomy

Topics: Visual Attention and Saliency Detection