DeepRare: Generic Unsupervised Visual Attention Models

Phutphalla Kong; Matei Mancas; Bernard Gosselin; Kimtho Po

arXiv:2109.11439 · cs.CV · September 24, 2021 · 1 citation

Open Access

TL;DR

DeepRare2021 is a universal, unsupervised visual attention model that combines deep learning features with engineered algorithms, demonstrating high efficiency and generality across diverse datasets without requiring training.

Contribution

It introduces DeepRare2021, a training-free, fast, and generic visual attention model that outperforms many existing models across multiple datasets and architectures.

Findings

01. DeepRare2021 achieves top performance on various eye-tracking datasets.

02. The model is architecture-agnostic, working well with VGG16, VGG19, and MobileNetV2.

03. It provides interpretability by highlighting surprising image regions.

Abstract

The human visual system is modeled in engineering through feature-engineered methods that detect contrasted, surprising, or unusual data in images. Such data is "interesting" to humans and leads to numerous applications. Deep neural networks (DNNs) have drastically improved the efficiency of these algorithms on the main benchmark datasets. However, DNN-based models are counter-intuitive: surprising or unusual data is by definition difficult to learn because of its low occurrence probability. In practice, DNN-based models mainly learn top-down features such as faces, text, people, or animals, which usually attract human attention, but they are inefficient at extracting surprising or unusual data from images. In this paper, we propose a new visual attention model called DeepRare2021 (DR21), which combines the power of DNN feature extraction with the genericity of feature-engineered algorithms. This…
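The idea that surprising data has a low occurrence probability can be illustrated with a small sketch. This is not the DeepRare2021 implementation: the function name `rarity_map`, the equal-width binning, and the `n_bins` parameter are assumptions made for illustration only. Each cell of a single 2-D feature map is scored by the self-information, -log(p), of the histogram bin its value falls into, so values that occur rarely receive high scores.

```python
import math
from collections import Counter


def rarity_map(feature_map, n_bins=8):
    """Score each cell of a 2-D feature map by how rare its value is.

    Illustrative assumption, not the paper's algorithm: values are
    quantized into n_bins equal-width bins, and each cell receives the
    self-information -log(p) of its bin.
    """
    values = [v for row in feature_map for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant map

    def bin_of(v):
        # Clamp so that the maximum value falls into the last bin.
        return min(int((v - lo) / span * n_bins), n_bins - 1)

    counts = Counter(bin_of(v) for v in values)
    total = len(values)
    return [[-math.log(counts[bin_of(v)] / total) for v in row]
            for row in feature_map]


# A mostly uniform map with a single outlier in the centre.
fmap = [[0.1, 0.1, 0.1],
        [0.1, 0.9, 0.1],
        [0.1, 0.1, 0.1]]
scores = rarity_map(fmap)
```

In this toy input the centre cell is the only one in its bin, so it receives the highest score, while the eight identical cells share a low one. In a real pipeline the input would be activation maps taken from a pretrained network such as VGG16 rather than a hand-written grid.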

Taxonomy

Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Retinal Imaging and Analysis

Methods: Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Batch Normalization · Inverted Residual Block · 1x1 Convolution · Average Pooling · Convolution