Pelee: A Real-Time Object Detection System on Mobile Devices
Robert J. Wang, Xiang Li, Charles X. Ling

TL;DR
Pelee is a real-time object detection system optimized for mobile devices, combining an efficient CNN architecture called PeleeNet with SSD, achieving high accuracy and speed with lower computational cost and smaller model size.
Contribution
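The paper introduces PeleeNet, an efficient CNN architecture for mobile devices built from conventional convolution rather than depthwise separable convolution, and Pelee, a real-time detection system that combines PeleeNet with SSD and is reported to beat comparable models in speed, accuracy, and model size.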
Findings
On ImageNet ILSVRC 2012, PeleeNet achieves higher accuracy than MobileNet and MobileNetV2 and runs over 1.8 times faster on NVIDIA TX2.
The Pelee system attains 76.4% mAP on PASCAL VOC2007 at 23.6 FPS on iPhone 8.
Pelee outperforms YOLOv2 with higher precision and lower computational cost.
Abstract
An increasing need of running Convolutional Neural Network (CNN) models on mobile devices with limited computing power and memory resource encourages studies on efficient model design. A number of efficient architectures have been proposed in recent years, for example, MobileNet, ShuffleNet, and MobileNetV2. However, all these models are heavily dependent on depthwise separable convolution which lacks efficient implementation in most deep learning frameworks. In this study, we propose an efficient architecture named PeleeNet, which is built with conventional convolution instead. On ImageNet ILSVRC 2012 dataset, our proposed PeleeNet achieves a higher accuracy and over 1.8 times faster speed than MobileNet and MobileNetV2 on NVIDIA TX2. Meanwhile, PeleeNet is only 66% of the model size of MobileNet. We then propose a real-time object detection system by combining PeleeNet with Single…
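The abstract contrasts depthwise separable convolution with conventional convolution. As a rough illustration of why MobileNet-style models lean on the former, the sketch below counts multiply-accumulate operations (MACs) for one layer of each kind using the standard textbook formulas. The layer shape (56 x 56 output, 128 input and output channels, 3 x 3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper, and the counts say nothing about measured speed, which the abstract notes depends on framework support.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for a conventional k x k convolution with an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    h = w = 56
    c_in = c_out = 128
    k = 3

    standard = conv_macs(h, w, c_in, c_out, k)
    separable = depthwise_separable_macs(h, w, c_in, c_out, k)

    print(f"conventional:        {standard:,} MACs")
    print(f"depthwise separable: {separable:,} MACs")
    print(f"ratio:               {separable / standard:.3f}")
```

The ratio simplifies to 1/c_out + 1/k², so for this shape the separable layer needs roughly 12% of the arithmetic. That lower operation count only translates into lower latency when the framework implements depthwise convolution efficiently, which is the gap the abstract says PeleeNet sidesteps by using conventional convolution.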
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
