TL;DR
This paper improves image-based mutual gaze detection by adding an auxiliary 3D gaze estimation task during training. The auxiliary branch is trained on pseudo 3D gaze labels deduced from the mutual gaze labels and shares the head image encoder with the detection branch, so performance improves without extra labeling.
Contribution
It introduces an auxiliary 3D gaze estimation branch, trained on pseudo labels deduced from existing mutual gaze labels, that boosts mutual gaze detection accuracy at no additional annotation cost.
Findings
Detection performance improves significantly on three image datasets.
Pseudo 3D gaze labels, deduced from mutual gaze labels, work as auxiliary supervision during training.
New dataset with 33.1K human pairs annotated with mutual gaze.
Abstract
Mutual gaze detection, i.e., predicting whether or not two people are looking at each other, plays an important role in understanding human interactions. In this work, we focus on the task of image-based mutual gaze detection, and propose a simple and effective approach to boost the performance by using an auxiliary 3D gaze estimation task during the training phase. We achieve the performance boost without additional labeling cost by training the 3D gaze estimation branch using pseudo 3D gaze labels deduced from mutual gaze labels. By sharing the head image encoder between the 3D gaze estimation and the mutual gaze detection branches, we achieve better head features than those learned by training the mutual gaze detection branch alone. Experimental results on three image datasets show that the proposed approach improves the detection performance significantly without additional annotations.…
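The abstract does not spell out how pseudo 3D gaze labels are deduced from mutual gaze labels. One plausible reading, sketched below as an illustration and not as the authors' procedure, is that for a pair labeled as mutual gaze each person's gaze points at the other person's head. The function name `pseudo_gaze_pair` and the use of a relative-depth value as the third coordinate are assumptions made for this example.

```python
import math

def pseudo_gaze_pair(head_a, head_b):
    """Hypothetical helper: pseudo 3D gaze directions for a mutual-gaze pair.

    head_a, head_b: (x, y, z) head centers. The z value is an assumed
    relative-depth proxy; the summary does not say how depth is obtained.
    Returns two unit vectors: A's gaze toward B, and B's gaze toward A.
    """
    diff = tuple(b - a for a, b in zip(head_a, head_b))
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("head centers coincide; gaze direction is undefined")
    a_to_b = tuple(d / norm for d in diff)
    # Under mutual gaze, B looks back along the same line in the opposite direction.
    b_to_a = tuple(-d for d in a_to_b)
    return a_to_b, b_to_a

# Example: two heads at the same assumed depth.
gaze_a, gaze_b = pseudo_gaze_pair((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
```

In a multi-task setup like the one the abstract describes, vectors of this kind could serve as regression targets for the 3D gaze branch while the shared head encoder also feeds the mutual gaze classifier. Pairs labeled as not looking at each other carry no such constraint under this reading.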