Performance of object recognition in wearable videos
Alberto Sabater, Luis Montesano, Ana C. Murillo

TL;DR
This paper evaluates the YOLO object detection architecture on wearable videos, where low image quality, clutter, and occlusion make detection harder, and provides insights for future improvements in this domain.
Contribution
It offers a comprehensive analysis of YOLO's effectiveness on wearable videos, including various modifications and training strategies, to guide future research.
Findings
YOLO performs well with a good accuracy-speed trade-off in wearable videos.
Certain architectural variations improve detection in cluttered, low-quality wearable footage.
The study identifies promising directions for enhancing object detection in challenging wearable video scenarios.
Abstract
Wearable technologies are enabling plenty of new applications of computer vision, from life logging to health assistance. Many of them are required to recognize the elements of interest in the scene captured by the camera. This work studies the problem of object detection and localization on videos captured by this type of camera. Wearable videos are a much more challenging scenario for object detection than standard images or even other types of video, due to lower-quality images (e.g., poor focus) and the high clutter and occlusion common in wearable recordings. Existing work typically focuses on detecting the objects of focus or those being manipulated by the user wearing the camera. We perform a more general evaluation of the task of object detection in this type of video, because numerous applications, such as marketing studies, also need to detect objects that are not in focus by the…
Taxonomy
Methods: You Only Look Once (YOLO)
