xYOLO: A Model For Real-Time Object Detection In Humanoid Soccer On Low-End Hardware

Daniel Barry, Munir Shah, Merel Keijsers, Humayun Khan, and Banon Hopman

arXiv:1910.03159·cs.CV·October 9, 2019

TL;DR

This paper introduces xYOLO, an adaptation of the YOLO CNN model that runs object detection at 9.66 FPS on a Raspberry Pi 3 B, fast enough for humanoid soccer robots to detect goals and balls in real time on low-end hardware.

Contribution

The paper presents xYOLO, an adaptation of YOLO that reaches 9.66 FPS on the Raspberry Pi 3 B, roughly 70 times faster than Tiny-YOLO's 0.14 FPS on the same board, together with an annotated dataset for goal and ball detection.

Findings

01

xYOLO achieves 9.66 FPS on Raspberry Pi 3 B.

02

xYOLO is approximately 70 times faster than Tiny-YOLO on the same hardware (9.66 FPS versus 0.14 FPS).

03

The speed gain comes from trading away some accuracy, which the authors consider acceptable for detecting goals and balls in soccer.
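
As a quick sanity check on the speed figures, the sketch below recomputes the speedup and the per-frame latency from the two frame rates quoted on this page (9.66 FPS for xYOLO and the 0.14 FPS Tiny-YOLO baseline reported in the abstract). The function names are illustrative and do not come from the paper; only the two FPS constants are taken from the source.

```python
# Frame rates reported for the Raspberry Pi 3 B (taken from the abstract).
TINY_YOLO_FPS = 0.14
XYOLO_FPS = 9.66


def frame_time_seconds(fps: float) -> float:
    """Average time spent on one frame, in seconds, at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps


def speedup(new_fps: float, old_fps: float) -> float:
    """How many times faster new_fps is than old_fps."""
    if old_fps <= 0:
        raise ValueError("old_fps must be positive")
    return new_fps / old_fps


if __name__ == "__main__":
    print(f"Tiny-YOLO: {frame_time_seconds(TINY_YOLO_FPS):.2f} s per frame")
    print(f"xYOLO:     {frame_time_seconds(XYOLO_FPS):.3f} s per frame")
    print(f"Speedup:   {speedup(XYOLO_FPS, TINY_YOLO_FPS):.1f}x")
```

The ratio works out to about 69, which matches the "approximately 70 times faster" claim. In latency terms, Tiny-YOLO needs roughly 7 seconds per frame on this board, while xYOLO needs roughly a tenth of a second.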

Abstract

With the emergence of onboard vision processing for areas such as the internet of things (IoT), edge computing and autonomous robots, there is increasing demand for computationally efficient convolutional neural network (CNN) models that perform real-time object detection on resource-constrained hardware. Tiny-YOLO is generally considered one of the faster object detectors for low-end devices and is the basis for our work. Our experiments on this network have shown that Tiny-YOLO achieves 0.14 frames per second (FPS) on the Raspberry Pi 3 B, which is too slow for soccer-playing autonomous humanoid robots that must detect goal and ball objects. In this paper we propose an adaptation of the YOLO CNN model, named xYOLO, that achieves object detection at a speed of 9.66 FPS on the Raspberry Pi 3 B. This is achieved by trading an acceptable amount of accuracy, making the network…
