Real Eyes Realize Faster: Gaze Stability and Pupil Novelty for Efficient Egocentric Learning
Ajan Subramanian, Sumukh Bettadapura, Rohan Sathish

TL;DR
This paper introduces a dual-criterion frame curation method using gaze stability and pupil novelty from eye-tracking data to efficiently select high-quality, informative frames in egocentric videos, reducing redundancy and improving task performance.
Contribution
It presents a novel, inference-free, real-time frame selection approach leveraging eye-tracking signals for egocentric video summarization and task-specific data curation.
Findings
Curated frames at a 10% budget match the classification performance of the full stream.
Naive signal fusion reduces the effectiveness of the gaze and pupil signals.
Pupil ranking enhances activity recognition, while gaze dominates scene recognition.
Abstract
Always-on egocentric cameras are increasingly used as demonstrations for embodied robotics, imitation learning, and assistive AR, but the resulting video streams are dominated by redundant and low-quality frames. Under the storage and battery constraints of wearable devices, choosing which frames to keep is as important as how to learn from them. We observe that modern eye-tracking headsets provide a continuous, training-free side channel that decomposes into two complementary axes: gaze fixation captures visual stability (quality), while pupil response captures arousal-linked moments (novelty). We operationalize this insight as a Dual-Criterion Frame Curator that first gates frames by gaze quality and then ranks the survivors by pupil-derived novelty. On the Visual Experience Dataset (VEDB), curated frames at 10% budget match the classification performance of the full stream, and naive…
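The gate-then-rank idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names (`gaze_speed`, `pupil_novelty`, `curate_frames`), the use of frame-to-frame gaze displacement as the stability measure, the `max_speed` threshold, and the z-score-style pupil deviation used as the novelty score.

```python
import numpy as np


def gaze_speed(gaze_xy):
    """Frame-to-frame gaze displacement (hypothetical stability proxy)."""
    step = np.linalg.norm(np.diff(gaze_xy, axis=0), axis=1)
    # The first frame has no predecessor, so it is assigned zero displacement.
    return np.concatenate([[0.0], step])


def pupil_novelty(pupil):
    """Absolute deviation from the recording mean, in standard deviations
    (hypothetical novelty proxy)."""
    std = pupil.std()
    if std == 0:
        return np.zeros_like(pupil)
    return np.abs(pupil - pupil.mean()) / std


def curate_frames(gaze_xy, pupil, budget=0.10, max_speed=0.02):
    """Gate frames by gaze stability, then keep the most pupil-novel survivors.

    gaze_xy: (T, 2) gaze positions in normalized image coordinates (assumed).
    pupil:   (T,) pupil diameters.
    budget:  fraction of the T frames to keep.
    Returns the kept frame indices in temporal order.
    """
    gaze_xy = np.asarray(gaze_xy, dtype=float)
    pupil = np.asarray(pupil, dtype=float)
    k = max(1, int(round(budget * len(pupil))))

    # Step 1 (gate): keep only frames whose gaze barely moved.
    stable = np.flatnonzero(gaze_speed(gaze_xy) <= max_speed)

    # Step 2 (rank): order the surviving frames by descending novelty.
    novelty = pupil_novelty(pupil)
    ranked = stable[np.argsort(-novelty[stable], kind="stable")]

    return np.sort(ranked[:k])
```

Because the gate runs before the ranking, a frame with a large pupil response is still discarded if the gaze was moving at that moment. Both steps use only the eye-tracking signals and no learned model, which is in keeping with the training-free framing of the abstract.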
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · EEG and Brain-Computer Interfaces
