GLANCE: Gaze-Led Attention Network for Compressed Edge-inference
Neeraj Solanki, Hong Ding, Sepehr Tabrizchi, Ali Shafiee Sarvestani, Shaahin Angizi, David Z. Pan, Arman Roohi

TL;DR
This paper introduces GLANCE, a gaze-led attention network that enables real-time, energy-efficient object detection for AR/VR devices by combining gaze estimation with region-focused detection, achieving high accuracy with low latency.
Contribution
The paper presents a novel memory-centric, attention-guided detection pipeline that significantly reduces computation and energy use while maintaining high detection accuracy on resource-constrained devices.
Findings
Achieves 48.1% mAP on COCO with sub-10 ms latency.
Reduces computational load by 40-50% and energy consumption by 65%.
Outperforms baseline YOLOv12n in accuracy on small, medium, and large objects.
Abstract
Real-time object detection in AR/VR systems faces critical computational constraints, requiring sub-10 ms latency within tight power budgets. Inspired by biological foveal vision, we propose a two-stage pipeline that combines differentiable weightless neural networks for ultra-efficient gaze estimation with attention-guided region-of-interest object detection. Our approach eliminates arithmetic-intensive operations by performing gaze tracking through memory lookups rather than multiply-accumulate computations, achieving an angular error of with only 393 MACs and 2.2 KiB of memory per frame. Gaze predictions guide selective object detection on attended regions, reducing computational burden by 40-50% and energy consumption by 65%. Deployed on the Arduino Nano 33 BLE, our system achieves 48.1% mAP on COCO (51.8% on attended objects) while maintaining sub-10 ms…
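The second stage described in the abstract, running detection only on the region the user is looking at, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gaze_roi` and `pixel_fraction`, the fixed ROI size, and the clamping strategy are all assumptions made for illustration, and the pixel fraction computed here is a rough proxy, not the paper's reported 40-50% compute reduction.

```python
def gaze_roi(gaze_xy, image_wh, roi_wh):
    """Return (left, top, right, bottom) of a fixed-size ROI centred on the
    gaze point, shifted as needed so it stays inside the image.

    Hypothetical helper: the ROI size and clamping rule are assumptions.
    """
    gx, gy = gaze_xy
    w, h = image_wh
    # The ROI can never be larger than the image itself.
    rw, rh = min(roi_wh[0], w), min(roi_wh[1], h)
    # Centre on the gaze point, then clamp to the image bounds.
    left = min(max(int(round(gx - rw / 2)), 0), w - rw)
    top = min(max(int(round(gy - rh / 2)), 0), h - rh)
    return left, top, left + rw, top + rh


def pixel_fraction(roi, image_wh):
    """Fraction of the full frame's pixels that fall inside the ROI."""
    left, top, right, bottom = roi
    return (right - left) * (bottom - top) / (image_wh[0] * image_wh[1])


# Usage: a 320x320 ROI on a 640x480 frame, with gaze at the frame centre.
roi = gaze_roi((320, 240), (640, 480), (320, 320))
print(roi)                              # (160, 80, 480, 400)
print(pixel_fraction(roi, (640, 480)))  # about 0.333
```

A detector would then be run on the cropped region only, so its per-frame cost scales with the ROI area rather than the full frame. How GLANCE sizes the region and handles objects outside it is not specified in the abstract above.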
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Advanced Neural Network Applications
