Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices
Yiwei Zhao, Ziyun Li, Win-San Khwa, Xiaoyu Sun, Sai Qian Zhang, Syed Shakib Sarwar, Kleber Hugo Stangherlin, Yi-Lun Lu, Jorge Tomas Gomez, Jae-Sun Seo, Phillip B. Gibbons, Barbara De Salvo, Chiao Liu

TL;DR
This paper introduces H4H-NAS, a neural architecture search framework that designs efficient hybrid CNN/ViT models optimized for NPU-CIM heterogeneous edge devices, significantly improving latency and energy efficiency for AR/VR applications.
Contribution
It presents a novel NAS framework leveraging real silicon and industry IP performance data to optimize hybrid models for NPU-CIM systems, addressing latency and energy challenges.
Findings
Achieves up to 1.34% accuracy improvement on ImageNet.
Reduces latency by up to 56.08%.
Reduces energy consumption by up to 41.72%.
Abstract
Low-latency and low-power edge AI is essential for Virtual Reality and Augmented Reality applications. Recent advances show that hybrid models, combining convolution layers (CNN) and transformers (ViT), often achieve a superior accuracy/performance tradeoff on various computer vision and machine learning (ML) tasks. However, hybrid ML models can pose system challenges for latency and energy efficiency due to their diverse dataflow and memory access patterns. In this work, we leverage the architectural heterogeneity of Neural Processing Units (NPU) and Compute-In-Memory (CIM) and apply diverse execution schemas to execute these hybrid models efficiently. We also introduce H4H-NAS, a Neural Architecture Search framework to design efficient hybrid CNN/ViT models for heterogeneous edge systems with both NPU and CIM. Our H4H-NAS approach is powered by a performance estimator built…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Computer Graphics and Visualization Techniques · Neural Networks and Applications
Methods: Convolution
