NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models
Yang Xu, Huihong Shi, and Zhongfeng Wang

TL;DR
NASH is a joint neural architecture and accelerator search framework for multiplication-reduced hybrid models. It improves hardware efficiency and throughput while maintaining accuracy, targeting the deployment of neural networks on edge devices.
Contribution
NASH proposes a zero-shot metric for efficient hybrid model search and a coarse-to-fine accelerator search, then integrates the two to find well-matched model and accelerator pairs.
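To make the phrase "coarse-to-fine" concrete, here is a minimal sketch of a generic two-stage search in Python. It is not the authors' algorithm: the cost model `toy_latency`, the two parameters `pe_mult` and `pe_shift` (processing elements assigned to multiplication-based and multiplication-free layers), the workloads, and the budget of 64 are all hypothetical stand-ins chosen for illustration. The first stage scans a sparse grid of configurations, and the second stage searches exhaustively in a small window around the best coarse point.

```python
from itertools import product


def toy_latency(pe_mult, pe_shift, work_mult=600.0, work_shift=400.0, budget=64):
    """Hypothetical cost model (an assumption, not taken from the paper).

    Two engines run in parallel, so latency is set by the slower one.
    Configurations that exceed the resource budget are infeasible.
    """
    if pe_mult < 1 or pe_shift < 1 or pe_mult + pe_shift > budget:
        return float("inf")
    return max(work_mult / pe_mult, work_shift / pe_shift)


def coarse_to_fine_search(cost, lo=1, hi=64, coarse_step=8, radius=7):
    """Generic two-stage grid search over a pair of integer parameters."""
    # Coarse stage: evaluate a sparse grid to locate a promising region.
    coarse_grid = range(lo, hi + 1, coarse_step)
    center = min(product(coarse_grid, repeat=2), key=lambda p: cost(*p))

    # Fine stage: exhaustive search in a window around the coarse winner.
    fine_a = range(max(lo, center[0] - radius), min(hi, center[0] + radius) + 1)
    fine_b = range(max(lo, center[1] - radius), min(hi, center[1] + radius) + 1)
    best = min(product(fine_a, fine_b), key=lambda p: cost(*p))
    return best, cost(*best)


if __name__ == "__main__":
    best, latency = coarse_to_fine_search(toy_latency)
    print("best (pe_mult, pe_shift):", best, "latency:", round(latency, 3))
```

With these toy settings the coarse stage evaluates 64 configurations and the fine stage 225, compared with 4096 for a full scan of the same range. The trade-off is that the fine window can miss the optimum if the coarse grid lands in the wrong region.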
Findings
Achieves 2.14x throughput and 2.01x FPS with 0.25% accuracy gain on CIFAR-100.
Achieves 1.40x throughput and 1.19x FPS with 0.56% accuracy gain on Tiny-ImageNet.
Outperforms state-of-the-art multiplication-based systems in efficiency and speed.
Abstract
The significant computational cost of multiplications hinders the deployment of deep neural networks (DNNs) on edge devices. While multiplication-free models offer enhanced hardware efficiency, they typically sacrifice accuracy. As a solution, multiplication-reduced hybrid models have emerged to combine the benefits of both approaches. Particularly, prior works, i.e., NASA and NASA-F, leverage Neural Architecture Search (NAS) to construct such hybrid models, enhancing hardware efficiency while maintaining accuracy. However, they either entail costly retraining or encounter gradient conflicts, limiting both search efficiency and accuracy. Additionally, they overlook the acceleration opportunity introduced by accelerator search, yielding sub-optimal hardware performance. To overcome these limitations, we propose NASH, a Neural architecture and Accelerator Search framework for…
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks
