Driver Gaze Zone Estimation using Convolutional Neural Networks: A General Framework and Ablative Analysis
Sourabh Vora, Akshay Rangesh, and Mohan M. Trivedi

TL;DR
This paper presents a CNN-based framework for driver gaze zone estimation that aims to generalize across subjects, perspectives, and scales. It reports high cross-subject accuracy and good generalization to a separate dataset.
Contribution
The study fine-tunes four popular CNN architectures for driver gaze zone estimation, compares their outputs, examines how the input image patch and image size affect performance, and evaluates generalization across subjects and on a separate dataset, moving beyond personalized systems.
Findings
Achieved 95.18% accuracy in cross-subject testing
Outperformed existing state-of-the-art methods
Demonstrated good generalization on the Columbia Gaze Dataset
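Cross-subject testing means that no driver appears in both the training and test sets, so the reported accuracy reflects performance on unseen people. The following is a minimal sketch of such a split, not the authors' code. The function names, the `(subject_id, frame_id, zone_label)` tuple layout, the 30% hold-out fraction, and the zone names are all illustrative assumptions.

```python
import random


def cross_subject_split(samples, test_fraction=0.3, seed=0):
    """Split samples so that train and test share no subject.

    `samples` is a list of (subject_id, frame_id, zone_label) tuples.
    This layout is an assumption made for illustration only.
    """
    subjects = sorted({s[0] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Hold out whole subjects, never individual frames.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [s for s in samples if s[0] not in test_subjects]
    test = [s for s in samples if s[0] in test_subjects]
    return train, test, test_subjects


def accuracy(predicted, actual):
    """Fraction of gaze zone predictions that match the labels."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Hypothetical zone names and toy data: 10 subjects, 4 frames each.
    zones = ["forward", "left", "right", "rearview"]
    data = [(f"subject_{i}", j, zones[j]) for i in range(10) for j in range(4)]
    train, test, held_out = cross_subject_split(data)
    print(len(train), len(test), len(held_out))
```

Splitting by subject rather than by frame matters because consecutive frames from one driver are highly correlated; a frame-level split would let the model see the test subjects during training and overstate how well it generalizes.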
Abstract
Driver gaze has been shown to be an excellent surrogate for driver attention in intelligent vehicles. With the recent surge of highly autonomous vehicles, driver gaze can be useful for determining the handoff time to a human driver. While there has been significant improvement in personalized driver gaze zone estimation systems, a generalized system which is invariant to different subjects, perspectives and scales is still lacking. We take a step towards this generalized system using Convolutional Neural Networks (CNNs). We fine-tune four popular CNN architectures for this task, and provide extensive comparisons of their outputs. We additionally experiment with different input image patches, and also examine how image size affects performance. For training and testing the networks, we collect a large naturalistic driving dataset comprising 11 long drives, driven by 10 subjects in two…
