EyeTheia: A Lightweight and Accessible Eye-Tracking Toolbox
Stevenson Pather, Niels Martignène, Arnaud Bugnet, Fouad Boutaleb, Fabien D'Hondt, Deise Santana Maia

TL;DR
EyeTheia is a lightweight, open-source deep learning toolbox for real-time, webcam-based gaze estimation. It targets browser-based experiments and clinical research, and its output agrees strongly with a commercial webcam-based tracker in a stimulus task.
Contribution
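The paper introduces an accessible gaze-tracking pipeline that combines MediaPipe landmark extraction with a CNN inspired by iTracker. It compares two training strategies (adapting a model pretrained on mobile data versus training the same architecture from scratch on a desktop-oriented dataset) and evaluates optional user-specific fine-tuning.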
Findings
Pretrained and from-scratch models perform comparably on MPIIFaceGaze before calibration
Lightweight user-specific fine-tuning consistently reduces gaze prediction error
Strong agreement with a commercial webcam-based gaze tracker in a Dot-Probe task
Abstract
We introduce EyeTheia, a lightweight and open deep learning pipeline for webcam-based gaze estimation, designed for browser-based experimental platforms and real-world cognitive and clinical research. EyeTheia enables real-time gaze tracking using only a standard laptop webcam, combining MediaPipe-based landmark extraction with a convolutional neural network inspired by iTracker and optional user-specific fine-tuning. We investigate two complementary strategies: adapting a model pretrained on mobile data and training the same architecture from scratch on a desktop-oriented dataset. Validation results on MPIIFaceGaze show comparable performance between both approaches prior to calibration, while lightweight user-specific fine-tuning consistently reduces gaze prediction error. We further evaluate EyeTheia in a realistic Dot-Probe task and compare it to the commercial webcam-based tracker…
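To make the idea of user-specific adaptation concrete, here is a minimal sketch of how a few calibration samples can reduce gaze prediction error. It is not EyeTheia's method: the abstract describes fine-tuning a CNN, whereas this sketch substitutes a much simpler technique, a constant per-user offset correction. The function names, the coordinate units (cm on screen), and all data points are hypothetical and chosen only for illustration.

```python
import math

def mean_error(preds, targets):
    """Mean Euclidean distance between predicted and true gaze points."""
    return sum(math.dist(p, t) for p, t in zip(preds, targets)) / len(preds)

def fit_offset(preds, targets):
    """Estimate a constant (dx, dy) bias from calibration samples.

    Stand-in for user-specific fine-tuning; NOT the paper's procedure.
    """
    n = len(preds)
    dx = sum(t[0] - p[0] for p, t in zip(preds, targets)) / n
    dy = sum(t[1] - p[1] for p, t in zip(preds, targets)) / n
    return dx, dy

def apply_offset(preds, offset):
    """Shift every prediction by the estimated per-user bias."""
    return [(x + offset[0], y + offset[1]) for x, y in preds]

# Hypothetical calibration data: on-screen targets and the generic
# model's (biased) predictions for one user, in cm.
calib_targets = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
calib_preds = [(1.5, -1.0), (11.0, -0.5), (1.0, 9.0), (12.5, 8.5)]

# Hypothetical held-out points from the same user.
test_targets = [(5.0, 5.0), (2.0, 8.0)]
test_preds = [(6.0, 4.5), (4.0, 6.5)]

offset = fit_offset(calib_preds, calib_targets)
error_before = mean_error(test_preds, test_targets)
error_after = mean_error(apply_offset(test_preds, offset), test_targets)

print(f"offset: {offset}")
print(f"mean error before: {error_before:.3f} cm, after: {error_after:.3f} cm")
```

The correction is fitted on the calibration points and evaluated on separate held-out points, which mirrors how a per-user adaptation step would normally be assessed. A real fine-tuning step would update network weights rather than a two-parameter offset.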
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Neurobiology of Language and Bilingualism · Visual Attention and Saliency Detection
