A Self Validation Network for Object-Level Human Attention Estimation

Zehua Zhang; Chen Yu; David Crandall

arXiv:1910.14260·cs.CV·December 17, 2019·1 cites

TL;DR

This paper introduces a unified model with a Self Validation Module for accurately estimating the object of human attention in first-person videos by integrating spatial and temporal cues, outperforming existing methods.

Contribution

It presents a novel Self Validation Module that enforces consistency between spatial and temporal evidence for improved attention estimation in egocentric videos.

Findings

01. The model outperforms state-of-the-art methods on two public datasets.

02. The Self Validation Module improves both training and testing performance.

03. The approach effectively integrates spatial and temporal information for attention estimation.

Abstract

Due to the foveated nature of the human vision system, people can focus their visual attention on only a small region of their visual field at a time, which usually contains a single object. Estimating this object of attention in first-person (egocentric) videos is useful for many human-centered real-world applications, such as augmented reality and driver assistance systems. A straightforward solution to this problem is to pick the object whose bounding box is hit by the gaze, where the gaze point is obtained from a traditional eye gaze estimator and object candidates are generated by an off-the-shelf object detector. However, such an approach can fail because it addresses the where and the what problems separately, even though they are highly related chicken-and-egg problems. In this paper, we propose a novel unified model that incorporates both spatial and…
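The straightforward baseline mentioned in the abstract (pick the object whose bounding box is hit by the gaze point) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the function name `attended_object`, the `(x_min, y_min, x_max, y_max)` box format, and the rule of preferring the smallest box when several contain the gaze point are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes get area 0."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def attended_object(gaze: Tuple[float, float], boxes: List[Box]) -> Optional[int]:
    """Return the index of the box hit by the gaze point, or None if no box is hit.

    If several boxes contain the gaze point, the smallest one is chosen.
    That tie-break is an assumption of this sketch, not something the
    abstract specifies.
    """
    gx, gy = gaze
    hits = [
        i
        for i, (x0, y0, x1, y1) in enumerate(boxes)
        if x0 <= gx <= x1 and y0 <= gy <= y1
    ]
    if not hits:
        return None
    return min(hits, key=lambda i: box_area(boxes[i]))


if __name__ == "__main__":
    # Hypothetical detections: a table, a cup on the table, and a distant object.
    boxes = [(0, 0, 100, 100), (40, 40, 60, 60), (200, 200, 300, 300)]
    print(attended_object((50, 50), boxes))    # gaze lands inside the cup and the table
    print(attended_object((150, 150), boxes))  # gaze lands on no detection at all
```

The second call hints at why the abstract says this approach can fail: the gaze estimate and the detections are produced independently, so a small gaze error or a missed detection leaves no box to pick, and nothing lets one source of evidence correct the other.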

Code & Models

Repositories

zehzhang/MindreaderNet-Mr.-Net- (TensorFlow)

Taxonomy

Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Retinal Imaging and Analysis