DHECA-SuperGaze: Dual Head-Eye Cross-Attention and Super-Resolution for Unconstrained Gaze Estimation

Franko Šikić; Donik Vršnak; Sven Lončarić

arXiv:2505.08426·cs.CV·March 6, 2026

TL;DR

DHECA-SuperGaze is a novel deep learning approach that combines super-resolution and dual head-eye cross-attention to improve unconstrained gaze estimation accuracy in real-world scenarios.

Contribution

The paper introduces DHECA-SuperGaze, a method that integrates super-resolution and cross-attention modules to improve gaze prediction. It also corrects dataset annotation errors so that evaluation is more reliable.

Findings

01 Reduces angular error by up to 0.59° in static settings.

02 Achieves over 1.53° improvement in cross-dataset tests.

03 Demonstrates superior performance on Gaze360 and GFIE datasets.
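
The findings above are stated in degrees of angular error. As a minimal sketch, assuming the usual definition (the angle between the predicted and ground-truth 3D gaze direction vectors), the metric could be computed as follows. The function name `angular_error_deg` is hypothetical and is not taken from the paper.

```python
import numpy as np


def angular_error_deg(pred, target):
    """Angle in degrees between predicted and ground-truth gaze vectors.

    Assumes both inputs are non-zero 3D direction vectors, either a single
    vector of shape (3,) or a batch of shape (N, 3).
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Normalize so that only direction matters, not vector length.
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    target = target / np.linalg.norm(target, axis=-1, keepdims=True)
    # Clip to guard against rounding slightly outside [-1, 1] before arccos.
    cos = np.clip(np.sum(pred * target, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))


# Usage: per-sample errors for a small batch, then the mean over the batch.
preds = np.array([[0.0, 0.0, -1.0], [1.0, 0.0, 0.0]])
targets = np.array([[0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
errors = angular_error_deg(preds, targets)
mean_error = errors.mean()
```

Under this definition, an improvement of 0.59° means the mean angle between predicted and true gaze directions shrinks by that amount on the benchmark in question.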

Abstract

Unconstrained gaze estimation is the process of determining where a subject is directing their visual attention in uncontrolled environments. Gaze estimation systems are important for a myriad of tasks, such as driver distraction monitoring, exam proctoring, and accessibility features in modern software. However, these systems face challenges in real-world scenarios, partially due to the low resolution of in-the-wild images and partially due to insufficient modeling of head-eye interactions in current state-of-the-art (SOTA) methods. This paper introduces DHECA-SuperGaze, a deep learning-based method that advances gaze prediction through super-resolution (SR) and a dual head-eye cross-attention (DHECA) module. Our dual-branch convolutional backbone processes eye and multiscale SR head images, while the proposed DHECA module enables bidirectional feature refinement between the extracted…
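
The bidirectional refinement mentioned in the abstract can be illustrated with generic scaled dot-product cross-attention, where head features attend to eye features and eye features attend to head features. The sketch below is not the authors' DHECA module: the single-head formulation, the residual connections, the token counts, and all names (`cross_attention`, `dual_cross_attention`, `params`) are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product attention of `queries` over `context`."""
    q = queries @ w_q          # (n_q, d)
    k = context @ w_k          # (n_c, d)
    v = context @ w_v          # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def dual_cross_attention(head_tokens, eye_tokens, params):
    """Refine each stream with information from the other (assumed residual form)."""
    head_refined = head_tokens + cross_attention(
        head_tokens, eye_tokens, *params["head_from_eye"])
    eye_refined = eye_tokens + cross_attention(
        eye_tokens, head_tokens, *params["eye_from_head"])
    return head_refined, eye_refined


# Usage with random stand-ins for backbone features (hypothetical sizes).
rng = np.random.default_rng(0)
d = 8
head_tokens = rng.normal(size=(5, d))   # e.g. 5 head-feature tokens
eye_tokens = rng.normal(size=(3, d))    # e.g. 3 eye-feature tokens
params = {
    "head_from_eye": [rng.normal(size=(d, d)) * 0.1 for _ in range(3)],
    "eye_from_head": [rng.normal(size=(d, d)) * 0.1 for _ in range(3)],
}
head_out, eye_out = dual_cross_attention(head_tokens, eye_tokens, params)
```

Each stream keeps its own shape, so the refined features could be passed on to whatever fusion or regression stage follows. How the paper actually combines them is not shown in the truncated abstract.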

Taxonomy

Methods: Softmax · Attention Is All You Need · Autoencoders