HQNAS: Auto CNN deployment framework for joint quantization and architecture search
Hongjiang Chen, Yang Wang, Leibo Liu, Shaojun Wei, Shouyi Yin

TL;DR
HQNAS is a framework that performs neural architecture search and quantization jointly to fit neural networks to edge devices. It finds a network and quantization policy in 4 GPU hours on CIFAR10 and reduces latency by 1.8x with minimal accuracy loss.
Contribution
The paper introduces HQNAS, a joint NAS and quantization framework using weight-sharing and bit-sharing, achieving faster search and better deployment performance on embedded systems.
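The summary above does not spell out how bit-sharing works, so the following is only a minimal sketch under one stated assumption: that every candidate bit-width reuses a single shared weight tensor and is quantized from it on the fly, rather than training separate weights per bit-width. The function names, the symmetric uniform quantizer, and the candidate bit-widths (2, 4, 8) are illustrative choices, not taken from the paper.

```python
import random


def quantize(weights, bits):
    """Symmetric uniform fake-quantization (an illustrative choice, not the paper's).

    Maps each float to a signed integer code in [-qmax, qmax] and back to a float.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mean_abs_error(reference, approx):
    """Average absolute difference between two equally long lists."""
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


random.seed(0)

# One shared weight tensor (hypothetical values) reused by every bit-width candidate.
shared_weights = [random.uniform(-1.0, 1.0) for _ in range(256)]

# Hypothetical candidate bit-widths; each is evaluated against the same weights.
candidate_bits = [2, 4, 8]
errors = {
    bits: mean_abs_error(shared_weights, quantize(shared_weights, bits))
    for bits in candidate_bits
}

for bits in candidate_bits:
    print(f"{bits}-bit: mean abs quantization error = {errors[bits]:.4f}")
```

Under this assumption, adding a bit-width candidate adds no new trainable parameters, which is one plausible reason a joint search could stay cheap. The quantization error shrinks as the bit-width grows, which is the trade-off a search would weigh against latency.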
Findings
Discovers a network and quantization policy in 4 GPU hours on CIFAR10
Finds comparable models on ImageNet with 10% of the GPU time
Reduces latency by 1.8x with minimal accuracy loss
Abstract
Deep learning applications are being transferred from the cloud to the edge with the rapid development of embedded computing systems. To achieve higher energy efficiency within a limited resource budget, neural networks (NNs) must be carefully designed in two steps: the architecture design and the quantization policy choice. Neural Architecture Search (NAS) and quantization have been proposed separately for deploying NNs onto embedded devices. However, taking the two steps individually is time-consuming and leads to a sub-optimal final deployment. To this end, we propose a novel neural network design framework called Hardware-aware Quantized Neural Architecture Search (HQNAS), which combines NAS and quantization in a very efficient manner using weight-sharing and bit-sharing. It takes only 4 GPU hours to discover an outstanding NN policy on CIFAR10. It also…
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Image and Video Retrieval Techniques
