CNN-based search model underestimates attention guidance by simple visual features

Endel Poder

arXiv:2103.15439·cs.CV·April 27, 2021·1 cites

Open Access

TL;DR

This paper evaluates a CNN-based attention guidance model on visual search tasks and finds that it considerably underestimates human attention guidance by simple visual features, possibly because the model has no bottom-up guidance or because standard CNNs do not learn the features required.

Contribution

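The study adapts the CNN-based attention guidance model of Zhang et al. (2018) for search experiments with accuracy as the performance measure, and shows that it falls short of human attention guidance in simulated feature and conjunction search experiments.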

Findings

01

CNN model underestimates human attention guidance

02

The model has no bottom-up guidance of attention

03

Standard CNNs may not learn features needed for human-like attention

Abstract

Recently, Zhang et al. (2018) proposed an interesting model of attention guidance that uses visual features learnt by convolutional neural networks for object recognition. I adapted this model for search experiments with accuracy as the measure of performance. Simulation of our previously published feature and conjunction search experiments revealed that the CNN-based search model considerably underestimates human attention guidance by simple visual features. A simple explanation is that the model has no bottom-up guidance of attention. Another view might be that standard CNNs do not learn the features required for human-like attention guidance.
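
To make the idea of purely top-down guidance with accuracy as the performance measure concrete, here is a toy sketch. It is not the model of Zhang et al. (2018) or the adaptation described in the abstract: hand-made two-dimensional feature vectors stand in for CNN features, the Gaussian noise is an assumption, and the names `top_down_priority` and `simulate_accuracy` are hypothetical. Each item's priority is its similarity to a target template, the highest-priority item is selected, and accuracy is the proportion of trials on which that item is the target.

```python
import random


def top_down_priority(item_features, target_template):
    """Priority of each item: dot product of its features with the target template."""
    return [
        sum(f * t for f, t in zip(item, target_template))
        for item in item_features
    ]


def simulate_accuracy(n_items, n_trials, noise_sd, seed=0):
    """Proportion of trials on which the highest-priority item is the target.

    Assumptions (illustrative only): the target is coded [1, 0], distractors
    are coded [0, 1], and independent Gaussian noise is added to every feature.
    """
    rng = random.Random(seed)
    target = [1.0, 0.0]
    distractor = [0.0, 1.0]
    correct = 0
    for _ in range(n_trials):
        # Place one target at a random position among the distractors.
        target_pos = rng.randrange(n_items)
        items = [list(distractor) for _ in range(n_items)]
        items[target_pos] = list(target)
        # Noisy feature observations for every item.
        noisy = [[v + rng.gauss(0.0, noise_sd) for v in item] for item in items]
        priority = top_down_priority(noisy, target)
        # Attend to the single item with the highest priority.
        chosen = max(range(n_items), key=priority.__getitem__)
        correct += chosen == target_pos
    return correct / n_trials


if __name__ == "__main__":
    for sd in (0.0, 0.5, 2.0):
        print(sd, simulate_accuracy(n_items=8, n_trials=1000, noise_sd=sd))
```

The sketch has no bottom-up component: an item that merely differs from its neighbours gains no priority unless it also matches the template, which is the kind of omission the abstract offers as a simple explanation.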

Taxonomy

Topics: Visual Attention and Saliency Detection · Visual perception and processing mechanisms · Infrared Target Detection Methodologies