From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
Xue Geng, Zhe Wang, Chunyun Chen, Qing Xu, Kaixin Xu, Chao Jin, Manas Gupta, Xulei Yang, Zhenghua Chen, Mohamed M. Sabry Aly, Jie Lin, Min Wu, Xiaoli Li

TL;DR
This survey comprehensively reviews recent advances in efficient, secure, and hardware-accelerated deployment of deep neural networks, covering model compression, hardware design, and security techniques.
Contribution
It provides an integrated overview of recent methods in model compression, hardware acceleration, and security for DNN deployment, highlighting their interconnections and challenges.
Findings
Model compression techniques significantly reduce DNN resource requirements.
Hardware accelerators are increasingly tailored to compressed models.
Homomorphic encryption enhances DNN security during deployment.
Abstract
Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI) tasks. However, deploying them brings significant challenges due to the huge cost of memory, energy, and computation. To address these challenges, researchers have developed various model compression techniques such as model quantization and model pruning. Recently, there has been a surge in research on compression methods that achieve model efficiency while retaining performance. Furthermore, a growing number of works focus on customizing DNN hardware accelerators to better leverage model compression techniques. In addition to efficiency, preserving security and privacy is critical for deploying DNNs. However, the vast and diverse body of related work can be overwhelming. This inspires us to conduct a comprehensive survey on recent research toward the goal of high-performance, cost-efficient,…
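To make the two compression techniques named in the abstract concrete, here is a minimal sketch in plain Python. It is not taken from the survey: the function names (`quantize_int8`, `dequantize`, `magnitude_prune`), the symmetric per-tensor int8 scheme, the global magnitude criterion, and the sample weights are all illustrative assumptions, chosen because they are among the simplest variants of quantization and pruning.

```python
# Illustrative sketch only: names, schemes, and sample weights are assumptions,
# not methods or values taken from the survey.

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 if all weights are zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


weights = [0.8, -0.05, 0.3, -1.27, 0.02, 0.6]  # hypothetical layer weights

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
pruned = magnitude_prune(weights, sparsity=0.5)

print("int8 codes:", codes)
print("max quantization error:", max(abs(w - r) for w, r in zip(weights, restored)))
print("pruned weights:", pruned)
```

In this sketch, quantization replaces each floating-point weight with a one-byte integer code plus a single shared scale, while pruning sets half of the weights to zero so they can be skipped or stored sparsely. Practical methods typically add calibration, per-channel scales, structured sparsity patterns, and fine-tuning to recover accuracy.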
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Anomaly Detection Techniques and Applications
