Gaze-based Object Detection in the Wild

Daniel Weber; Wolfgang Fuhl; Andreas Zell; Enkelejda Kasneci

arXiv:2203.15651 · cs.RO · January 26, 2023

Open Access

TL;DR

This paper explores gaze-based object detection in realistic human-robot interaction scenarios, using heatmaps derived from gaze data and machine learning to identify objects and their bounding boxes efficiently.

Contribution

The paper introduces a gaze-based detection method that computes heatmaps over variable temporal windows and grid sizes, and reports that it is faster and more resource-efficient than conventional object detectors.

Findings

01. Effective object detection from gaze heatmaps in real-world scenarios

02. Method achieves high speed and low resource usage

03. Public dataset available for further research

Abstract

In human-robot collaboration, one challenging task is to teach a robot new, previously unknown objects so that it can interact with them. Gaze can carry valuable information for this task. We investigate whether it is possible to detect objects (object or no object) from gaze data alone and to determine their bounding box parameters. For this purpose, we explore different sizes of temporal windows, which serve as the basis for computing heatmaps, i.e., the spatial distribution of the gaze data. Additionally, we analyze different grid sizes for these heatmaps and demonstrate the approach in a proof of concept using different machine learning techniques. Our method is characterized by its speed and resource efficiency compared to conventional object detectors. To generate the required data, we conducted a study with five subjects who could move freely and thus turn towards arbitrary…
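As a rough illustration of the heatmap idea described in the abstract, the sketch below bins gaze samples from one temporal window into a square grid and normalizes the counts. It is a minimal sketch and not the authors' implementation: the function name `gaze_heatmap`, the `(timestamp, x, y)` sample layout, the pixel coordinate convention, and all parameter names and example values are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical sample layout: (timestamp in seconds, x in pixels, y in pixels).
GazeSample = Tuple[float, float, float]


def gaze_heatmap(
    samples: Sequence[GazeSample],
    t_end: float,
    window_s: float,
    grid_size: int,
    frame_w: int,
    frame_h: int,
) -> List[List[float]]:
    """Bin gaze points from the window (t_end - window_s, t_end] into a grid.

    Returns a grid_size x grid_size list of lists (rows index y, columns
    index x) whose cells sum to 1 when at least one sample falls inside
    both the window and the frame, and are all 0 otherwise.
    """
    heatmap = [[0.0] * grid_size for _ in range(grid_size)]
    count = 0
    for t, x, y in samples:
        # Keep only samples inside the temporal window.
        if not (t_end - window_s < t <= t_end):
            continue
        # Drop gaze points that fall outside the scene frame.
        if not (0 <= x < frame_w and 0 <= y < frame_h):
            continue
        col = int(x * grid_size / frame_w)
        row = int(y * grid_size / frame_h)
        heatmap[row][col] += 1.0
        count += 1
    if count:
        # Normalize counts so the heatmap is a spatial distribution.
        heatmap = [[cell / count for cell in row] for row in heatmap]
    return heatmap


# Illustrative values only: a 640x480 frame, a 1-second window, a 4x4 grid.
example_samples = [
    (0.1, 100.0, 100.0),
    (0.5, 110.0, 90.0),
    (0.9, 500.0, 400.0),
    (1.4, 120.0, 95.0),  # outside the window ending at t = 1.0
]
example_heatmap = gaze_heatmap(example_samples, 1.0, 1.0, 4, 640, 480)
```

Varying `window_s` and `grid_size` corresponds to the two parameters the abstract says are explored. A flattened heatmap of this kind could then be passed to a classifier or regressor, though the specific models used are not stated in the text shown here.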

Taxonomy

Topics: Gaze Tracking and Assistive Technology · Robotics and Sensor-Based Localization · Visual Attention and Saliency Detection

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings