Merging Multiple Datasets for Improved Appearance-Based Gaze Estimation

Liang Wu; Bertram E. Shi

arXiv:2409.00912·cs.CV·September 4, 2024

TL;DR

This paper introduces a novel transformer-based architecture and a gaze adaptation module to effectively combine multiple datasets for appearance-based gaze estimation, overcoming protocol and label inconsistencies to improve accuracy.

Contribution

It proposes a two-stage transformer fusion method and a dataset-specific gaze adaptation module to enhance multi-dataset gaze estimation performance.

Findings

01

Improved gaze estimation accuracy by 10-20% over state-of-the-art methods.

02

Effective handling of dataset protocol and label inconsistencies.

03

Demonstrated benefits of the proposed architecture through extensive experiments.

Abstract

Multiple datasets have been created for training and testing appearance-based gaze estimators. Intuitively, more data should lead to better performance. However, combining datasets to train a single estimator rarely improves gaze estimation performance. One reason may be differences in the experimental protocols used to obtain the gaze samples, resulting in differences in the distributions of head poses, gaze angles, illumination, etc. Another reason may be the inconsistency between methods used to define gaze angles (label mismatch). We propose two innovations to improve the performance of gaze estimation by leveraging multiple datasets, a change in the estimator architecture and the introduction of a gaze adaptation module. Most state-of-the-art estimators merge information extracted from images of the two eyes and the entire face either in parallel or combine information from the…
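The label-mismatch problem described in the abstract can be illustrated with a toy example. The sketch below is not the paper's gaze adaptation module, whose design is not given in the truncated abstract. It assumes, purely for illustration, that each dataset's labels differ from a shared estimator's output by a constant (pitch, yaw) offset in degrees. The function names `fit_dataset_offsets` and `adapt` and the sample values are hypothetical.

```python
from collections import defaultdict


def fit_dataset_offsets(samples):
    """Estimate one (pitch, yaw) offset per dataset.

    samples: iterable of (dataset_id, shared_pred, label), where
    shared_pred and label are (pitch, yaw) tuples in degrees.
    ASSUMPTION: label mismatch is modelled as a constant offset,
    estimated as the mean residual (label - prediction) per dataset.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for dataset_id, pred, label in samples:
        acc = sums[dataset_id]
        acc[0] += label[0] - pred[0]  # pitch residual
        acc[1] += label[1] - pred[1]  # yaw residual
        acc[2] += 1                   # sample count
    return {ds: (s[0] / s[2], s[1] / s[2]) for ds, s in sums.items()}


def adapt(pred, offset):
    """Map a shared prediction into one dataset's label convention."""
    return (pred[0] + offset[0], pred[1] + offset[1])


# Made-up data: dataset "A" agrees with the shared estimator,
# dataset "B" labels the same gaze with a shifted convention.
samples = [
    ("A", (10.0, -5.0), (10.0, -5.0)),
    ("A", (0.0, 20.0), (0.0, 20.0)),
    ("B", (10.0, -5.0), (12.0, -4.0)),
    ("B", (0.0, 20.0), (2.0, 21.0)),
]
offsets = fit_dataset_offsets(samples)
adapted = adapt((5.0, 5.0), offsets["B"])
```

With one correction per dataset, a single shared estimator can be trained on all datasets while each dataset's labels are compared against predictions expressed in that dataset's own convention. A learned module would replace the fixed offset used here.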

Code & Models

Repositories

hkust-nisl/gazesetmerge
PyTorch · Official

Taxonomy

Topics: Gaze Tracking and Assistive Technology · Video Surveillance and Tracking Methods · Hand Gesture Recognition Systems