Predicting the Category and Attributes of Visual Search Targets Using Deep Gaze Pooling

Hosnieh Sattar; Andreas Bulling; Mario Fritz

arXiv:1611.10162·cs.CV·April 4, 2017·1 cites

TL;DR

This paper introduces a gaze pooling layer for CNNs that predicts the category and attributes of a visual search target from eye gaze data. The layer remains effective when added to an already trained CNN.

Contribution

The authors propose a Gaze Pooling Layer that integrates gaze data into CNN-based architectures as an attention mechanism, enabling search target prediction even when the layer is added to an already trained CNN.

Findings

01 Effective gaze-based search target prediction demonstrated

02 Gaze pooling layer improves recognition accuracy

03 Method works with pre-trained CNNs without retraining

Abstract

Predicting the target of visual search from eye fixation (gaze) data is a challenging problem with many applications in human-computer interaction. In contrast to previous work that has focused on individual instances as a search target, we propose the first approach to predict categories and attributes of search targets based on gaze data. However, state-of-the-art models for categorical recognition generally require large amounts of training data, which is prohibitive for gaze data. To address this challenge, we propose a novel Gaze Pooling Layer that integrates gaze information into CNN-based architectures as an attention mechanism, incorporating both spatial and temporal aspects of human gaze behavior. We show that our approach is effective even when the gaze pooling layer is added to an already trained CNN, thus eliminating the need for expensive joint data collection of visual…
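To make the idea of gaze as an attention mechanism over CNN features more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that fixations are given as (row, col, duration) triples in feature-map coordinates, that the spatial aspect is modelled with a Gaussian around each fixation, and that the temporal aspect is modelled by weighting each fixation by its duration. The function names `fixation_density_map` and `gaze_pool`, and the `sigma` parameter, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fixation_density_map(fixations, height, width, sigma=1.5):
    """Build a normalized density map from (row, col, duration) fixations.

    Assumption: coordinates are already expressed on the feature-map grid.
    """
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    fdm = np.zeros((height, width), dtype=np.float64)
    for r, c, duration in fixations:
        # Spatial aspect: a Gaussian bump centred on the fixation.
        bump = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Temporal aspect (assumption): longer fixations contribute more.
        fdm += duration * bump
    total = fdm.sum()
    if total == 0:
        # No gaze data: uniform weights, i.e. plain average pooling.
        return np.full((height, width), 1.0 / (height * width))
    return fdm / total


def gaze_pool(feature_maps, fdm):
    """Gaze-weighted pooling: (C, H, W) feature maps -> (C,) feature vector."""
    return (feature_maps * fdm[None, :, :]).sum(axis=(1, 2))


# Toy example: two channels, one active near the fixations, one far away.
features = np.zeros((2, 6, 8))
features[0, 1, 1] = 1.0
features[1, 4, 6] = 1.0
fdm = fixation_density_map([(1, 1, 0.4), (1, 2, 0.2)], 6, 8)
pooled = gaze_pool(features, fdm)
```

In this toy example the channel that responds near the fixated region receives the larger pooled value, which is the sense in which the gaze map acts as attention. Because the pooling step has no trainable parameters in this sketch, it could in principle sit between the convolutional features and the classifier of an already trained network.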

Taxonomy

Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Retinal Imaging and Analysis