A Modular Multimodal Architecture for Gaze Target Prediction: Application to Privacy-Sensitive Settings
Anshul Gupta, Samy Tafasca, Jean-Marc Odobez

TL;DR
This paper introduces a modular multimodal architecture for gaze target prediction that leverages depth and pose cues to improve accuracy, especially in privacy-sensitive contexts like surveillance and health.
Contribution
The paper presents a novel modular architecture that combines multimodal cues with attention mechanisms for improved gaze prediction in privacy-sensitive scenarios.
Findings
Achieved state-of-the-art results on GazeFollow and VideoAttentionTarget datasets.
Demonstrated competitive performance in privacy-sensitive settings.
Validated the effectiveness of explicit multimodal cue integration.
Abstract
Predicting where a person is looking is a complex task. It requires understanding not only the person's gaze and the scene content, but also the 3D scene structure and the person's situation (are they manipulating? interacting or observing others? attentive?) in order to detect obstructions in the line of sight or to apply the attention priors that humans typically have when observing others. In this paper, we hypothesize that identifying and leveraging such priors can be better achieved by exploiting explicitly derived multimodal cues such as depth and pose. We thus propose a modular multimodal architecture that combines these cues using an attention mechanism. The architecture can naturally be exploited in privacy-sensitive situations such as surveillance and health, where personally identifiable information cannot be released. We perform extensive experiments on the GazeFollow and…
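The abstract does not say how the attention mechanism weights the cues, so the following is only a minimal sketch of one common way to do it: softmax attention over per-modality feature vectors. The function names, the dot-product scoring, the `query` vector, and the toy feature values are all assumptions made for illustration, not the authors' implementation. The sketch also shows why a modular design suits privacy-sensitive use: the image cue can simply be left out and the remaining cues (depth, pose) are fused on their own.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(features, query):
    """Fuse per-modality feature vectors with attention weights.

    features: dict mapping a modality name (hypothetical, e.g. "depth")
              to a feature vector (list of floats, all the same length).
    query:    vector of the same length used to score each modality
              (an assumption; a real model would learn this).
    Returns the fused vector and the weight assigned to each modality.
    """
    names = sorted(features)
    # Dot-product score between the query and each modality's features.
    scores = [sum(q * f for q, f in zip(query, features[n])) for n in names]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the modality vectors, one dimension at a time.
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy feature vectors, invented for this example.
all_cues = {"image": [1.0, 0.0], "depth": [0.0, 1.0], "pose": [1.0, 1.0]}
query = [1.0, 1.0]

fused_full, w_full = fuse_modalities(all_cues, query)

# Privacy-sensitive variant: drop the image cue, keep depth and pose.
private_cues = {k: v for k, v in all_cues.items() if k != "image"}
fused_private, w_private = fuse_modalities(private_cues, query)

print(w_full)
print(w_private)
```

Because the weights are renormalized over whichever modalities are present, the same fusion step works unchanged whether or not the image cue is available.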
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Video Surveillance and Tracking Methods · Gait Recognition and Analysis
