Latency-Aware Differentiable Neural Architecture Search
Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong

TL;DR
This paper introduces a latency-aware differentiable neural architecture search method that incorporates latency prediction into the optimization process, enabling the design of neural networks that balance accuracy and hardware efficiency.
Contribution
The paper adds latency as a differentiable loss term in neural architecture search, which makes the resulting networks more hardware-friendly.
Findings
The latency predictor achieves a relative error below 10% when trained on 100K sampled architectures.
The method reduces network latency by 20% while maintaining accuracy.
The approach is adaptable to various hardware platforms and non-differentiable factors.
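To make the relative-error figure concrete, here is a minimal sketch of how such a number could be computed. It assumes the metric is the mean absolute relative error between predicted and measured latency; the summary does not state the exact definition, and the latency values below are made up for illustration.

```python
def mean_relative_error(predicted, measured):
    """Average of |predicted - measured| / measured over all samples."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / m for p, m in zip(predicted, measured))
    return total / len(measured)


# Made-up latencies in milliseconds, not results from the paper.
predicted_ms = [10.5, 19.0, 33.0]
measured_ms = [10.0, 20.0, 30.0]

err = mean_relative_error(predicted_ms, measured_ms)
print(f"mean relative error: {err:.2%}")
```

Under this assumed definition, a predictor meets the reported bar when the returned value is below 0.10.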
Abstract
Differentiable neural architecture search methods have become popular in recent years, mainly due to their low search costs and flexibility in designing the search space. However, these methods have difficulty optimizing the network, so the searched network is often unfriendly to hardware. This paper addresses the problem by adding a differentiable latency loss term to the optimization, so that the search process can trade off accuracy against latency with a balancing coefficient. The core of latency prediction is to encode each network architecture and feed it into a multi-layer regressor, whose training data can be easily collected by randomly sampling a number of architectures and evaluating them on the hardware. We evaluate our approach on NVIDIA Tesla-P100 GPUs. With 100K sampled architectures (requiring a few hours), the latency prediction module arrives at a…
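The pipeline the abstract describes (encode an architecture, pass it through a multi-layer regressor, add the predicted latency to the loss) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the operation list `OPS`, the edge count `NUM_EDGES`, the one-hot encoding, the layer sizes, and all function names are hypothetical, and the regressor is left untrained. In the actual method the regressor would be fitted to latencies measured on hardware, and the latency term would be differentiated with respect to the architecture parameters.

```python
import random

# Hypothetical candidate operations; the real search space is not listed here.
OPS = ["skip_connect", "sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3"]
NUM_EDGES = 4  # assumed small edge count to keep the example short


def sample_architecture(rng):
    """Randomly pick one operation per edge."""
    return [rng.choice(OPS) for _ in range(NUM_EDGES)]


def encode(arch):
    """Concatenate one one-hot vector per edge into a flat feature vector."""
    vec = []
    for op in arch:
        vec.extend(1.0 if op == cand else 0.0 for cand in OPS)
    return vec


def init_mlp(sizes, rng):
    """Create a (weights, biases) pair per layer with small random weights."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers


def predict_latency(layers, x):
    """Forward pass: ReLU on hidden layers, linear scalar output."""
    for i, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bj for row, bj in zip(w, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x[0]


def search_loss(accuracy_loss, predicted_latency, lam):
    """Accuracy loss plus the latency term scaled by the balancing coefficient."""
    return accuracy_loss + lam * predicted_latency


rng = random.Random(0)
arch = sample_architecture(rng)
features = encode(arch)
regressor = init_mlp([len(features), 8, 1], rng)  # untrained, illustration only
latency = predict_latency(regressor, features)
total = search_loss(accuracy_loss=1.25, predicted_latency=latency, lam=0.5)
print(arch, total)
```

Setting `lam` to zero recovers an accuracy-only search, while larger values push the search toward lower predicted latency.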
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
