U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search
Ahmet Caner Yüzügüler, Nikolaos Dimitriadis, Pascal Frossard

TL;DR
This paper introduces U-Boost NAS, a novel neural architecture search framework that optimizes for resource utilization, accuracy, and latency, significantly improving inference speed on accelerators like Google TPU.
Contribution
It presents a new hardware-aware NAS framework that incorporates resource utilization modeling, enabling more efficient DNN inference on array-based accelerators.
Findings
Achieves 2.8-4x speedup in DNN inference
Maintains similar or better accuracy on CIFAR-10 and Imagenet-100
Validates a new resource utilization model for inference accelerators
Abstract
Optimizing resource utilization in target platforms is key to achieving high performance during DNN inference. While optimizations have been proposed for inference latency, memory footprint, and energy consumption, prior hardware-aware neural architecture search (NAS) methods have omitted resource utilization, preventing DNNs from taking full advantage of the target inference platforms. Modeling resource utilization efficiently and accurately is challenging, especially for widely used array-based inference accelerators such as Google TPU. In this work, we propose a novel hardware-aware NAS framework that optimizes not only for task accuracy and inference latency, but also for resource utilization. We also propose and validate a new computational model for resource utilization in inference accelerators. By using the proposed NAS framework and the proposed resource utilization model, we…
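To make the notion of resource utilization on an array-based accelerator concrete, here is a minimal sketch under a simple tile-padding assumption: a workload that needs `rows_needed` x `cols_needed` processing elements is split into tiles the size of the array, and partially filled tiles leave elements idle. The function name, its parameters, and the 128 x 128 default array size are illustrative assumptions, not the utilization model proposed in the paper.

```python
import math


def array_utilization(rows_needed, cols_needed, array_rows=128, array_cols=128):
    """Fraction of processing elements doing useful work.

    Assumes a simple tile-padding model (an illustration, not the
    paper's model): the workload is split into array-sized tiles and
    every tile occupies the full array, even when only partly filled.
    """
    if min(rows_needed, cols_needed, array_rows, array_cols) <= 0:
        raise ValueError("all dimensions must be positive")
    # Number of array-sized tiles needed along each dimension.
    row_tiles = math.ceil(rows_needed / array_rows)
    col_tiles = math.ceil(cols_needed / array_cols)
    # Processing elements reserved across all tiles, including idle ones.
    allocated = row_tiles * array_rows * col_tiles * array_cols
    return (rows_needed * cols_needed) / allocated


if __name__ == "__main__":
    for rows in (64, 128, 129):
        print(rows, array_utilization(rows, 128))
```

Under this assumption, a dimension of 128 fills the array exactly, while 129 spills into a second tile and utilization drops to roughly one half. The ceiling terms also make the quantity piecewise constant in the layer dimensions, which hints at why utilization is awkward to fold into a gradient-based search objective.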
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Machine Learning and Data Classification
