Scene Text Detection and Recognition "in light of" Challenging Environmental Conditions using Aria Glasses Egocentric Vision Cameras
Joseph De Mathia, Carlos Francisco Moreno-García

TL;DR
This paper evaluates scene text detection and recognition using Meta's Aria glasses under challenging environmental conditions, highlighting the impact of resolution, distance, and lighting, and proposing eye-gaze integration for efficiency.
Contribution
Introduces a new dataset and benchmarks Scene Text Detection and Recognition (STDR) algorithms in egocentric AR scenarios, demonstrating the effects of environmental factors and the benefits of image upscaling and gaze-based focus.
Findings
Resolution and distance significantly affect recognition accuracy.
Image upscaling reduces Character Error Rate (CER) from 0.65 to 0.48.
Gaze tracking can optimize processing efficiency.
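For readers unfamiliar with the metric, Character Error Rate is conventionally the character-level edit distance between the recognised string and the ground truth, divided by the length of the ground truth. The sketch below follows that common definition. The summary does not say how the authors computed CER, so the normalisation by reference length and the function names `levenshtein` and `cer` are assumptions made for illustration.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate, normalised by reference length (assumed)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of four.
    print(cer("EXIT", "EX1T"))
```

Under this definition a CER of 0 means a perfect transcription, and values can exceed 1 when the recogniser outputs many spurious characters.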
Abstract
In an era where wearable technology is reshaping applications, Scene Text Detection and Recognition (STDR) becomes a straightforward choice through the lens of egocentric vision. Leveraging Meta's Project Aria smart glasses, this paper investigates how environmental variables, such as lighting, distance, and resolution, affect the performance of state-of-the-art STDR algorithms in real-world scenarios. We introduce a novel, custom-built dataset captured under controlled conditions and evaluate two OCR pipelines: EAST with CRNN, and EAST with PyTesseract. Our findings reveal that resolution and distance significantly influence recognition accuracy, while lighting plays a less predictable role. Notably, image upscaling emerged as a key pre-processing technique, reducing Character Error Rate (CER) from 0.65 to 0.48. We further demonstrate the potential of integrating eye-gaze tracking to…
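One plausible way to use gaze for efficiency is to run detection and recognition only on a window around the point the wearer is looking at, rather than on the full frame. The sketch below computes such a window and clamps it to the image bounds. It is an illustration only: the abstract is truncated before it describes the authors' gaze integration, so the fixed-size crop strategy, the function name `gaze_crop_box` and all parameters are hypothetical.

```python
def gaze_crop_box(gaze_x: float, gaze_y: float,
                  img_w: int, img_h: int,
                  crop_w: int, crop_h: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a crop window centred on
    the gaze point in pixel coordinates, shifted as needed so that it
    stays inside the image. Hypothetical helper, not the paper's method."""
    # A crop cannot be larger than the image itself.
    crop_w = min(crop_w, img_w)
    crop_h = min(crop_h, img_h)

    # Centre the window on the gaze point.
    left = int(round(gaze_x - crop_w / 2))
    top = int(round(gaze_y - crop_h / 2))

    # Shift the window back inside the frame near the borders.
    left = max(0, min(left, img_w - crop_w))
    top = max(0, min(top, img_h - crop_h))

    return left, top, left + crop_w, top + crop_h


if __name__ == "__main__":
    # Gaze at the centre of a 1000x1000 frame, 200x200 window.
    print(gaze_crop_box(500, 500, 1000, 1000, 200, 200))
```

With a NumPy-style image array, the returned box could be applied as `frame[top:bottom, left:right]` before passing the crop to the OCR pipeline. Processing a 200x200 window instead of a 1000x1000 frame reduces the pixel count by a factor of 25, which is the kind of saving that motivates gaze-based focus.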
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
