xYOLO: A Model For Real-Time Object Detection In Humanoid Soccer On Low-End Hardware
Daniel Barry, Munir Shah, Merel Keijsers, Humayun Khan, and Banon Hopman

TL;DR
This paper introduces xYOLO, an adapted CNN model that raises real-time object detection speed on low-end hardware such as the Raspberry Pi 3 B from 0.14 FPS (Tiny-YOLO) to 9.66 FPS, enabling humanoid soccer robots to detect goals and balls efficiently.
Contribution
The paper presents xYOLO, a novel adaptation of YOLO that achieves over 9 FPS on the Raspberry Pi 3 B, about 70 times faster than Tiny-YOLO, accompanied by an annotated dataset for goal and ball detection.
Findings
xYOLO achieves 9.66 FPS on Raspberry Pi 3 B.
xYOLO is approximately 70 times faster than Tiny-YOLO.
The model maintains acceptable accuracy for soccer object detection.
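The "approximately 70 times" figure follows directly from the two frame rates reported in the abstract (0.14 FPS for Tiny-YOLO and 9.66 FPS for xYOLO, both on the Raspberry Pi 3 B). A minimal sketch of that arithmetic; the function and variable names are illustrative and do not come from the paper:

```python
def speedup(new_fps: float, baseline_fps: float) -> float:
    """Return how many times faster new_fps is than baseline_fps."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return new_fps / baseline_fps


# Frame rates as reported in the abstract (Raspberry Pi 3 B).
tiny_yolo_fps = 0.14
xyolo_fps = 9.66

ratio = speedup(xyolo_fps, tiny_yolo_fps)
print(f"speedup: {ratio:.1f}x")

# The same numbers expressed as seconds per frame.
tiny_yolo_latency = 1 / tiny_yolo_fps
xyolo_latency = 1 / xyolo_fps
print(f"latency: {tiny_yolo_latency:.2f} s vs {xyolo_latency:.3f} s per frame")
```

In latency terms, this is roughly seven seconds per frame for Tiny-YOLO versus about a tenth of a second for xYOLO.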
Abstract
With the emergence of onboard vision processing for areas such as the internet of things (IoT), edge computing and autonomous robots, there is increasing demand for computationally efficient convolutional neural network (CNN) models to perform real-time object detection on resource-constrained hardware devices. Tiny-YOLO is generally considered one of the faster object detectors for low-end devices and is the basis for our work. Our experiments on this network have shown that Tiny-YOLO can achieve 0.14 frames per second (FPS) on the Raspberry Pi 3 B, which is too slow for soccer-playing autonomous humanoid robots detecting goal and ball objects. In this paper we propose an adaptation to the YOLO CNN model named xYOLO, that can achieve object detection at a speed of 9.66 FPS on the Raspberry Pi 3 B. This is achieved by trading an acceptable amount of accuracy, making the network…
