Neural Networks for Semantic Gaze Analysis in XR Settings
Lena Stubbemann, Dominik Dürrschnabel, Robert Refflinghaus

TL;DR
This paper introduces a CNN-based method for semantic gaze analysis in XR environments. It reduces annotation effort by training on synthetic data and applying object recognition techniques, and it works in both VR and AR settings.
Contribution
The novel approach uses synthetic data and CNNs to minimize annotation time for 3D scene gaze analysis without relying on markers or preexisting databases.
Findings
Method competes with state-of-the-art approaches
Does not require additional markers or databases
Works across virtual and real environments
Abstract
Virtual-reality (VR) and augmented-reality (AR) technology is increasingly combined with eye-tracking. This combination broadens both fields and opens up new areas of application, in which visual perception and related cognitive processes can be studied in interactive but still well-controlled settings. However, performing a semantic gaze analysis of eye-tracking data from interactive three-dimensional scenes is a resource-intensive task, which so far has been an obstacle to economic use. In this paper, we present a novel approach that minimizes the time and information necessary to annotate volumes of interest (VOIs) by using techniques from object recognition. To do so, we train convolutional neural networks (CNNs) on synthetic data sets derived from virtual models using image augmentation techniques. We evaluate our method in real and virtual environments, showing that the method can…
