Achieving Real-Time Object Detection on Mobile Devices with Neural Pruning Search
Pu Zhao, Wei Niu, Geng Yuan, Yuxuan Cai, Bin Ren, Yanzhi Wang, Xue Lin

TL;DR
This paper introduces a compiler-aware neural pruning search framework that enables real-time 2D and 3D object detection on mobile devices, balancing accuracy and inference speed for autonomous vehicle applications.
Contribution
It presents a novel neural pruning search method that automatically optimizes pruning schemes for object detection models on mobile hardware, achieving near real-time performance.
Findings
Achieves 55ms inference for YOLOv4 on mobile devices.
Achieves 99ms inference for PointPillars on mobile devices.
Maintains accuracy with minor or no loss during pruning.
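To put the two latency figures in terms of frame rate, the conversion below is a minimal sketch. It assumes frames are processed back to back with no other per-frame overhead, which the summary does not state; the helper name `fps_from_latency_ms` is illustrative only.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second if each frame takes `latency_ms` and frames run back to back."""
    return 1000.0 / latency_ms


# Latencies reported in the findings above.
for model, latency_ms in (("YOLOv4", 55.0), ("PointPillars", 99.0)):
    fps = fps_from_latency_ms(latency_ms)
    print(f"{model}: {latency_ms:.0f} ms per frame is about {fps:.1f} FPS")
```

Under that assumption, 55 ms corresponds to roughly 18 frames per second and 99 ms to roughly 10, which is why the abstract describes the results as close to real time rather than fully real time.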
Abstract
Object detection plays an important role in self-driving cars for security development. However, mobile systems on self-driving cars with limited computation resources lead to difficulties for object detection. To facilitate this, we propose a compiler-aware neural pruning search framework to achieve high-speed inference on autonomous vehicles for 2D and 3D object detection. The framework automatically searches the pruning scheme and rate for each layer to find a best-suited pruning for optimizing detection accuracy and speed performance under compiler optimization. Our experiments demonstrate that for the first time, the proposed method achieves (close-to) real-time, 55ms and 99ms inference times for YOLOv4 based 2D object detection and PointPillars based 3D detection, respectively, on an off-the-shelf mobile phone with minor (or no) accuracy loss.
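The per-layer search described in the abstract can be pictured with the following minimal sketch. It is not the authors' method: plain random search stands in for their search strategy, and the scheme names, pruning rates, latency model, and accuracy penalty are all hypothetical placeholders for on-device measurements taken under compiler optimization.

```python
import random
from dataclasses import dataclass

# Hypothetical search space; names and values are illustrative, not from the paper.
SCHEMES = ("unstructured", "filter", "block")
RATES = (0.0, 0.25, 0.5, 0.75)

# Assumed fraction of pruned work that becomes saved time after compilation.
SCHEME_EFFICIENCY = {"unstructured": 0.3, "filter": 1.0, "block": 0.8}
# Assumed accuracy penalty weight for each scheme.
SCHEME_PENALTY = {"unstructured": 0.5, "filter": 2.0, "block": 1.0}


@dataclass(frozen=True)
class Layer:
    name: str
    dense_latency_ms: float  # latency of the unpruned layer (made-up numbers)


def estimate_latency(layers, config):
    """Toy latency model: each layer speeds up in proportion to rate * efficiency."""
    return sum(
        layer.dense_latency_ms * (1.0 - rate * SCHEME_EFFICIENCY[scheme])
        for layer, (scheme, rate) in zip(layers, config)
    )


def estimate_accuracy_drop(config):
    """Toy accuracy proxy: the penalty grows quadratically with the pruning rate."""
    return sum(SCHEME_PENALTY[scheme] * rate * rate for scheme, rate in config)


def random_search(layers, latency_budget_ms, trials=2000, seed=0):
    """Return (accuracy_drop, latency_ms, config) for the best feasible sample, or None."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        # One (scheme, rate) pair per layer.
        config = tuple((rng.choice(SCHEMES), rng.choice(RATES)) for _ in layers)
        latency = estimate_latency(layers, config)
        if latency > latency_budget_ms:
            continue  # violates the speed target
        drop = estimate_accuracy_drop(config)
        if best is None or drop < best[0]:
            best = (drop, latency, config)
    return best


if __name__ == "__main__":
    layers = [Layer("conv1", 30.0), Layer("conv2", 40.0), Layer("conv3", 30.0)]
    result = random_search(layers, latency_budget_ms=60.0)
    if result is not None:
        drop, latency, config = result
        for layer, (scheme, rate) in zip(layers, config):
            print(f"{layer.name}: scheme={scheme}, rate={rate}")
        print(f"estimated latency: {latency:.1f} ms, accuracy proxy drop: {drop:.3f}")
```

In a real setting the two estimator functions would be replaced by measured latency of the compiled model on the target phone and by evaluation of the pruned detector, so the search would reflect how each pruning scheme actually behaves after compilation.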
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods
Methods: Pruning · Feature Pyramid Network · Grid Sensitive · Softmax · Bottom-up Path Augmentation · DropBlock · 1x1 Convolution · Cosine Annealing
