Gaze4HRI: Zero-shot Benchmarking Gaze Estimation Neural-Networks for Human-Robot Interaction

Berk Sezer; Ali Görkem Küçük; Erol Şahin; Sinan Kalkan

arXiv:2605.04770·cs.CV·May 7, 2026

TL;DR

Gaze4HRI introduces a large-scale dataset and benchmark for zero-shot 3D gaze estimation in human-robot interaction, and finds that training-data diversity matters more for robustness than model complexity.

Contribution

The paper presents a large-scale, HRI-specific gaze dataset and benchmark, and identifies training-data diversity, rather than model complexity, as the key to robustness.

Findings

01 All evaluated methods fail in at least one HRI condition.

02 PureGaze trained on ETH-XGaze maintains robustness across conditions.

03 Data diversity outweighs complex modeling for zero-shot gaze estimation.

Abstract

While zero-shot appearance-based 3D gaze estimation offers significant cost-efficiency by directly mapping RGB images to gaze vectors, its reliability in Human-Robot Interaction (HRI) settings remains uncertain. Existing benchmarks frequently overlook fundamental HRI conditions, such as dynamic camera viewpoints and moving targets in video. Furthermore, current cross-dataset evaluations often suffer from a complexity gap, where methods trained on diverse datasets are tested on significantly smaller and less varied sets, failing to assess true robustness. To bridge these gaps, we introduce Gaze4HRI, a large-scale dataset (50+ subjects, 3,000+ videos, 600,000+ frames) designed to evaluate state-of-the-art performance against critical HRI variables: illumination, head-gaze conflict, as well as the motion of camera and gaze target in video. Our benchmark reveals that all evaluated methods…


Code & Models

Repositories

https://gazeforhri.github.io
