Restoring Eye Contact to the Virtual Classroom with Machine Learning
Ross Greer, Shlomo Dubnov

TL;DR
This paper introduces a machine learning system that estimates a user's gaze target from single-camera image frames and displays it on their video feed, restoring eye contact in virtual classrooms and enhancing nonverbal communication and collaboration.
Contribution
It presents a modular gaze estimation system that predicts target-oriented gaze, demonstrated through a pilot study in virtual music education settings.
Findings
Improved success in interpreting cues in the virtual classroom
Enhanced student-reported collaboration and communication
Achieved inference speed and accuracy suitable for videoconferencing
Abstract
Nonverbal communication, in particular eye contact, is a critical element of the music classroom, shown to keep students on task, coordinate musical flow, and communicate improvisational ideas. Unfortunately, this nonverbal aspect of performance and pedagogy is lost in the virtual classroom. In this paper, we propose a machine learning system that takes single-instance, single-camera image frames as input to estimate the gaze target of a user seated in front of their computer. The system augments the user's video feed with a display of the estimated gaze target, thereby restoring nonverbal communication of directed gaze. The proposed estimation system consists of modular machine learning blocks, leading to a target-oriented (rather than coordinate-oriented) gaze prediction. We instantiate one such example of the complete system to run a pilot study in a virtual music classroom over Zoom…
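To make the distinction between target-oriented and coordinate-oriented prediction concrete, here is a minimal Python sketch of what a final "target selection" stage could look like. It is not the authors' implementation: the `Tile` class, the planar-screen geometry, the function names, and all numeric values are assumptions chosen for illustration. The sketch supposes that an earlier block has already produced gaze angles (yaw and pitch), projects them onto the screen plane, and then reports the participant tile nearest to that point rather than the raw coordinates.

```python
import math
from dataclasses import dataclass


@dataclass
class Tile:
    """Hypothetical on-screen video tile for one participant (units: cm)."""
    name: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def gaze_to_screen_point(yaw, pitch, distance_cm, eye_xy_cm):
    """Project gaze angles (radians) onto a flat screen.

    Assumes the screen is a plane perpendicular to the straight-ahead
    direction, x grows to the right and y grows downward, and eye_xy_cm
    is the point on the screen directly in front of the eyes.
    """
    px = eye_xy_cm[0] + distance_cm * math.tan(yaw)
    py = eye_xy_cm[1] - distance_cm * math.tan(pitch)  # looking up lowers y
    return (px, py)


def nearest_target(point, tiles):
    """Target-oriented output: the tile whose center is closest to the point."""
    return min(tiles, key=lambda t: math.dist(point, t.center()))


# Illustrative 2x2 gallery on a 40 cm x 20 cm screen (made-up layout).
tiles = [
    Tile("Alice", 0, 0, 20, 10),
    Tile("Bob", 20, 0, 20, 10),
    Tile("Carol", 0, 10, 20, 10),
    Tile("Dan", 20, 10, 20, 10),
]

# User sits 50 cm away, centered, looking slightly right and up.
point = gaze_to_screen_point(yaw=0.2, pitch=0.1, distance_cm=50, eye_xy_cm=(20, 10))
target = nearest_target(point, tiles)
print(target.name)
```

Reporting a named target instead of a screen coordinate means small angular errors are absorbed as long as the projected point stays closest to the intended tile, which is one plausible reason to prefer a target-oriented formulation for a videoconference gallery.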
