YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs
Prakhar Ganesh, Yao Chen, Yin Yang, Deming Chen, Marianne Winslett

TL;DR
This paper introduces YOLO-ReT, a real-time object detection model for edge GPUs. It pairs a multi-scale feature interaction module with a transfer learning backbone to improve both accuracy and execution speed on resource-constrained devices.
Contribution
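The paper proposes an edge-GPU-friendly module for multi-scale feature interaction, which exploits combinatorial connections between feature scales that existing methods leave out, and a transfer learning backbone adoption designed to complement that module. Together they improve both the accuracy and the execution speed of object detection models on edge GPU devices.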
Findings
YOLO-ReT achieves 68.75 mAP on Pascal VOC while running on a Jetson Nano.
YOLO-ReT outperforms its peers by 3.05 mAP on Pascal VOC and by 0.91 mAP on COCO.
Enhanced YOLOv4-tiny models improve COCO mAP by over 1 point.
Abstract
Performance of object detection models has been growing rapidly on two major fronts, model accuracy and efficiency. However, in order to map deep neural network (DNN) based object detection models to edge devices, one typically needs to compress such models significantly, thus compromising the model accuracy. In this paper, we propose a novel edge GPU friendly module for multi-scale feature interaction by exploiting missing combinatorial connections between various feature scales in existing state-of-the-art methods. Additionally, we propose a novel transfer learning backbone adoption inspired by the changing translational information flow across various tasks, designed to complement our feature interaction module and together improve both accuracy as well as execution speed on various edge GPU devices available in the market. For instance, YOLO-ReT with MobileNetV2x0.75 backbone runs…
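The abstract does not say which connections between feature scales are missing. As a purely illustrative sketch, assume a chain-style pathway that links only neighbouring scales directly; the snippet then lists the scale pairs that have no direct link. The function names and the stride values are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def adjacent_links(scales):
    """Scale pairs joined directly by a chain-style pathway (assumption:
    only neighbouring scales in the list are connected)."""
    return set(zip(scales, scales[1:]))


def all_links(scales):
    """Every unordered pair of scales, i.e. the full combinatorial set."""
    return set(combinations(scales, 2))


def missing_links(scales):
    """Pairs that the chain-style pathway does not connect directly."""
    return sorted(all_links(scales) - adjacent_links(scales))


if __name__ == "__main__":
    strides = [8, 16, 32, 64]  # hypothetical feature-map strides
    print(missing_links(strides))
```

With four scales, three of the six possible pairs lack a direct link under this assumption. The number of such pairs grows quadratically with the number of scales, while a chain adds only one link per extra scale.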
Videos
YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs (YouTube)
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · COVID-19 diagnosis using AI
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
