Detecting Engagement in Egocentric Video
Yu-Chuan Su, Kristen Grauman

TL;DR
This paper introduces a learning-based method that uses egomotion cues to detect moments of engagement in egocentric video, supported by a new annotated dataset for the task.
Contribution
It presents the first dataset for ego-engagement detection and a novel approach leveraging egomotion cues to identify engagement moments in egocentric videos.
Findings
The proposed method outperforms existing approaches on engagement detection.
Engagement detection is robust across different scenes.
Detection is independent of scene appearance and user identity.
Abstract
In a wearable camera video, we see what the camera wearer sees. While this makes it easy to know roughly what he chose to look at, it does not immediately reveal when he was engaged with the environment. Specifically, at what moments did his focus linger, as he paused to gather more information about something he saw? Knowing this answer would benefit various applications in video summarization and augmented reality, yet prior work focuses solely on the "what" question (estimating saliency, gaze) without considering the "when" (engagement). We propose a learning-based approach that uses long-term egomotion cues to detect engagement, specifically in browsing scenarios where one frequently takes in new visual information (e.g., shopping, touring). We introduce a large, richly annotated dataset for ego-engagement that is the first of its kind. Our approach outperforms a wide array of…
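To make the idea of "long-term egomotion cues" concrete, here is a minimal sketch and not the authors' implementation. It assumes a hypothetical list of per-frame motion magnitudes (for example, derived from optical flow), summarizes each frame by statistics over a surrounding temporal window, and flags frames with low windowed motion as candidate engagement. The fixed threshold `max_mean` is a stand-in for the learned classifier the abstract describes, and the premise that low motion indicates a pause is an assumption made for illustration. All function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def windowed_motion_features(motion, half_window):
    """Return (mean, std) of motion magnitude over a centered window per frame.

    `motion` is a hypothetical list of per-frame motion magnitudes.
    Windows are clipped at the start and end of the sequence.
    """
    feats = []
    n = len(motion)
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        window = motion[lo:hi]
        feats.append((mean(window), pstdev(window)))
    return feats


def candidate_engagement(motion, half_window=2, max_mean=0.5):
    """Flag frames whose windowed mean motion is at or below `max_mean`.

    The threshold is an illustrative assumption standing in for a
    learned classifier over the windowed features.
    """
    feats = windowed_motion_features(motion, half_window)
    return [m <= max_mean for m, _ in feats]


# Toy sequence: walking, a pause, then walking again.
motion = [2.0, 2.0, 2.0, 0.1, 0.1, 0.1, 0.1, 0.1, 2.0, 2.0, 2.0]
flags = candidate_engagement(motion)
```

Because each frame is judged by its temporal neighborhood rather than its own motion alone, only frames well inside the pause are flagged. A learned model would replace the threshold with a classifier trained on annotated engagement intervals.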
Taxonomy
Topics: Video Analysis and Summarization · Visual Attention and Saliency Detection · Multimodal Machine Learning Applications
