Scaling Up Deep Neural Network Optimization for Edge Inference
Bingqian Lu, Jianyi Yang, and Shaolei Ren

TL;DR
This paper introduces scalable methods for optimizing deep neural networks for diverse edge devices, enabling efficient DNN inference on mobile and embedded hardware through predictor reuse and neural network-based optimization.
Contribution
The authors propose two approaches: reusing performance predictors across devices by exploiting performance monotonicity, and building scalable performance predictors paired with a neural network-based optimizer that directly outputs the DNN design.
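As a rough illustration of the monotonicity idea, the sketch below checks whether a set of candidate DNNs is ranked the same way by latency on two devices. It is a minimal sketch, not the authors' method: the function names `ranks` and `spearman`, the choice of Spearman rank correlation as the monotonicity measure, and all latency numbers are assumptions made up for this example.

```python
def ranks(values):
    """Return the 0-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two tie-free samples of equal length."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical latencies (ms) of five candidate DNNs on two devices.
# These numbers are illustrative only and do not come from the paper.
proxy_latency = [12.0, 18.5, 25.1, 31.0, 44.2]
target_latency = [30.3, 41.0, 66.7, 70.2, 115.9]

# A correlation of 1 means both devices order the candidates identically.
rho = spearman(proxy_latency, target_latency)
print(f"rank correlation between proxy and target: {rho}")
```

When the rank correlation is close to 1, a predictor built for one device orders candidate designs the same way on the other device, which is the intuition behind reusing it instead of building a new predictor per device.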
Findings
Reusing predictors reduces optimization time across multiple devices.
A neural network-based optimizer directly outputs optimal DNN configurations.
Scalable predictors accurately estimate performance metrics for various device-DNN pairs.
Abstract
Deep neural networks (DNNs) have been increasingly deployed on and integrated with edge devices, such as mobile phones, drones, robots and wearables. To run DNN inference directly on edge devices (a.k.a. edge inference) with a satisfactory performance, optimizing the DNN design (e.g., network architecture and quantization policy) is crucial. While state-of-the-art DNN designs have leveraged performance predictors to speed up the optimization process, they are device-specific (i.e., each predictor for only one target device) and hence cannot scale well in the presence of extremely diverse edge devices. Moreover, even with performance predictors, the optimizer (e.g., search-based optimization) can still be time-consuming when optimizing DNNs for many different devices. In this work, we propose two approaches to scaling up DNN optimization. In the first approach, we reuse the performance…
Taxonomy
Topics: Neural Networks and Applications · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
