Weakly Supervised Attended Object Detection Using Gaze Data as Annotations
Michele Mazzamuto, Francesco Ragusa, Antonino Furnari, Giovanni Signorello, Giovanni Maria Farinella

TL;DR
This paper introduces a weakly supervised method for attended object detection in egocentric museum videos using gaze data, reducing labeling costs and providing a new dataset and baseline comparisons.
Contribution
It proposes a weakly supervised approach leveraging gaze data and frame labels, along with a new dataset, to detect attended objects without extensive annotations.
Findings
Achieves satisfactory detection performance with weak supervision.
Reduces annotation time and cost compared to fully supervised methods.
Provides a new dataset and baseline results for future research.
Abstract
We consider the problem of detecting and recognizing the objects observed by visitors (i.e., attended objects) in cultural sites from egocentric vision. A standard approach to the problem involves detecting all objects and selecting the one which best overlaps with the gaze of the visitor, measured through a gaze tracker. Since labeling large amounts of data to train a standard object detector is expensive in terms of costs and time, we propose a weakly supervised version of the task which leans only on gaze data and a frame-level label indicating the class of the attended object. To study the problem, we present a new dataset composed of egocentric videos and gaze coordinates of subjects visiting a museum. We hence compare three different baselines for weakly supervised attended object detection on the collected data. Results show that the considered approaches achieve satisfactory…
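The standard approach mentioned in the abstract (detect all objects, then pick the one that best matches the gaze) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Detection` type and `select_attended` function are hypothetical names, and the matching rule used here (the gaze point must fall inside the box, with ties broken by the box center nearest to the gaze) is only one assumed reading of "best overlaps".

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Detection:
    """Hypothetical container for one output of an object detector."""
    label: str
    score: float
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def select_attended(
    detections: Sequence[Detection],
    gaze: Tuple[float, float],
) -> Optional[Detection]:
    """Return the detection assumed to be attended, or None.

    Assumed rule (not taken from the paper): keep boxes that contain the
    gaze point, then choose the one whose center is closest to the gaze.
    """
    gx, gy = gaze

    # Keep only the boxes that contain the gaze point.
    hits = [
        d for d in detections
        if d.box[0] <= gx <= d.box[2] and d.box[1] <= gy <= d.box[3]
    ]
    if not hits:
        return None

    def center_dist_sq(d: Detection) -> float:
        # Squared distance between the box center and the gaze point.
        cx = (d.box[0] + d.box[2]) / 2.0
        cy = (d.box[1] + d.box[3]) / 2.0
        return (cx - gx) ** 2 + (cy - gy) ** 2

    return min(hits, key=center_dist_sq)
```

In this pipeline the detector must be trained on bounding-box annotations, which is the labeling cost the weakly supervised formulation aims to avoid.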
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Domain Adaptation and Few-Shot Learning · Indoor and Outdoor Localization Technologies
Methods: Softmax · Region Proposal Network · RoIPool · Convolution · Faster R-CNN
