Is Geometry Enough? An Evaluation of Landmark-Based Gaze Estimation

Daniele Agostinelli; Thomas Agostinelli; Andrea Generosi; Maura Mengoni

arXiv:2603.24724·cs.CV·March 27, 2026

Open Access

TL;DR

This paper evaluates the effectiveness of landmark-based geometric methods for gaze estimation, comparing them with deep learning models across multiple datasets, and finds that geometric features can be robust and efficient for practical applications.

Contribution

The paper provides a comprehensive benchmark of landmark-based gaze estimation, introduces a standardized pipeline for extracting and normalizing facial landmarks, and shows that lightweight models can generalize across datasets.
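To make the idea of landmark normalization concrete, here is a minimal sketch of one common approach: centering 2D landmarks on the midpoint between the eyes and dividing by the inter-ocular distance. This is an assumption for illustration, not the paper's pipeline, whose details are not given in this summary. The function name `normalize_landmarks`, the eye indices, and the sample coordinates are all hypothetical.

```python
import numpy as np

def normalize_landmarks(points, left_eye_idx, right_eye_idx):
    """Center 2D landmarks on the eye midpoint and scale by inter-ocular distance.

    Hypothetical normalization for illustration only; the paper's own
    procedure is not described in this summary.
    """
    pts = np.asarray(points, dtype=float)
    left, right = pts[left_eye_idx], pts[right_eye_idx]
    center = (left + right) / 2.0
    scale = np.linalg.norm(right - left)
    if scale == 0:
        raise ValueError("eye landmarks coincide; cannot normalize")
    return (pts - center) / scale

# Made-up pixel coordinates: left eye, right eye, nose tip.
landmarks = np.array([[100.0, 120.0], [140.0, 120.0], [120.0, 150.0]])
normalized = normalize_landmarks(landmarks, left_eye_idx=0, right_eye_idx=1)
```

A normalization of this kind removes image-plane translation and scale, so the same face at a different position or size in the frame yields the same feature vector.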

Findings

01

In within-domain evaluation, landmark-based models are less accurate than the deep learning baselines, which the authors attribute to noise.

02

In cross-domain evaluation, the MLP architectures generalize about as well as ResNet18.

03

Sparse geometric features are sufficient for robust gaze estimation.

Abstract

Appearance-based gaze estimation frequently relies on deep Convolutional Neural Networks (CNNs). These models are accurate, but computationally expensive and act as "black boxes", offering little interpretability. Geometric methods based on facial landmarks are a lightweight alternative, but their performance limits and generalization capabilities remain underexplored in modern benchmarks. In this study, we conduct a comprehensive evaluation of landmark-based gaze estimation. We introduce a standardized pipeline to extract and normalize landmarks from three large-scale datasets (Gaze360, ETH-XGaze, and GazeGene) and train lightweight regression models, specifically Extreme Gradient Boosted trees and two neural architectures: a holistic Multi-Layer Perceptron (MLP) and a siamese MLP designed to capture binocular geometry. We find that landmark-based models exhibit lower performance in…
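As a rough illustration of the siamese idea mentioned in the abstract, the sketch below runs both eyes' landmark features through one shared encoder and regresses gaze from the concatenated embeddings. It is a minimal sketch under stated assumptions, not the authors' architecture: the layer sizes, the feature count `N_EYE_FEATURES`, the two-value (pitch, yaw) output, and the random untrained weights are all hypothetical.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for a fully connected network."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Apply the layers in order, with ReLU on every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def siamese_gaze(encoder, head, left_eye, right_eye):
    """Encode each eye with the SAME encoder weights, then fuse and regress."""
    z_left = mlp_forward(encoder, left_eye)
    z_right = mlp_forward(encoder, right_eye)
    fused = np.concatenate([z_left, z_right], axis=-1)
    return mlp_forward(head, fused)

# All sizes below are hypothetical, chosen only to make the sketch runnable.
N_EYE_FEATURES = 16   # flattened landmark coordinates per eye (assumed)
EMBED_DIM = 8         # per-eye embedding size (assumed)

rng = np.random.default_rng(0)
encoder = init_mlp([N_EYE_FEATURES, 32, EMBED_DIM], rng)
head = init_mlp([2 * EMBED_DIM, 32, 2], rng)  # 2 outputs: pitch and yaw (assumed)

batch_left = rng.normal(size=(4, N_EYE_FEATURES))
batch_right = rng.normal(size=(4, N_EYE_FEATURES))
gaze = siamese_gaze(encoder, head, batch_left, batch_right)  # shape (4, 2)
```

Sharing the encoder weights means both eyes are embedded by the same function, so the regression head sees comparable left and right representations. A holistic MLP would instead take all landmarks as a single input vector.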

Taxonomy

Topics: Gaze Tracking and Assistive Technology · Face Recognition and Analysis · Visual Attention and Saliency Detection