Learning Video-independent Eye Contact Segmentation from In-the-Wild Videos
Tianyi Wu, Yusuke Sugano

TL;DR
This paper introduces an unsupervised method for video-independent eye contact segmentation in in-the-wild videos. It combines temporal modeling with pseudo-labels generated from unlabeled videos, so that a single model can be applied to arbitrary input videos.
Contribution
It proposes a unified, video-independent eye contact detection model trained with pseudo-labels, addressing the scarcity of labeled training data and the variation in location and size of eye contact targets across videos.
Findings
Achieves 71.88% framewise accuracy on annotated test videos.
Outperforms previous video-dependent eye contact detectors.
Introduces a gaze target discovery method for pseudo-label generation.
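For readers unfamiliar with the metric, framewise accuracy is commonly computed as the fraction of frames whose predicted label matches the annotation. The following is a minimal sketch under that assumption; the function name `framewise_accuracy` and the 0/1 label encoding are illustrative choices and are not taken from the paper.

```python
from typing import Sequence


def framewise_accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of frames where the predicted label equals the annotation.

    Assumption: labels are per-frame integers (e.g. 1 = eye contact, 0 = none).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    if len(truth) == 0:
        raise ValueError("at least one frame is required")
    correct = sum(1 for p, t in zip(pred, truth) if p == t)
    return correct / len(truth)


# Hypothetical 4-frame clip: three of the four frames agree.
print(framewise_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```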
Abstract
Human eye contact is a form of non-verbal communication and can have a great influence on social behavior. Since the location and size of the eye contact targets vary across different videos, learning a generic video-independent eye contact detector is still a challenging task. In this work, we address the task of one-way eye contact detection for videos in the wild. Our goal is to build a unified model that can identify when a person is looking at his gaze targets in an arbitrary input video. Considering that this requires time-series relative eye movement information, we propose to formulate the task as a temporal segmentation. Due to the scarcity of labeled training data, we further propose a gaze target discovery method to generate pseudo-labels for unlabeled videos, which allows us to train a generic eye contact segmentation model in an unsupervised way using in-the-wild videos. To…
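To illustrate what a temporal segmentation output looks like, the sketch below converts a sequence of per-frame labels into contiguous segments. It is only an illustration of the output format, not the authors' model; the helper name `labels_to_segments` and the half-open `(start, end, label)` tuple layout are assumptions made for this example.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def labels_to_segments(labels: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Group consecutive identical frame labels into (start, end, label) runs.

    `end` is exclusive, so a segment covers frames start .. end - 1.
    """
    segments: List[Tuple[int, int, int]] = []
    start = 0
    for value, run in groupby(labels):
        length = sum(1 for _ in run)  # number of frames in this run
        segments.append((start, start + length, value))
        start += length
    return segments


# Hypothetical 6-frame clip where 1 marks eye contact.
print(labels_to_segments([0, 0, 1, 1, 1, 0]))
```

Representing predictions as segments rather than isolated frames makes it easier to reason about when eye contact starts and ends.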
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Retinal Imaging and Analysis
Methods: Test
