Neural Architecture Search for Energy Efficient Always-on Audio Models
Daniel T. Speckhard, Karolis Misiunas, Sagi Perel, Tenghui Zhu, Simon Carlile, Malcolm Slaney

TL;DR
This paper introduces a neural architecture search method that jointly optimizes accuracy, energy efficiency, and memory usage for always-on audio classification on mobile devices, reporting an order of magnitude less energy per inference and a smaller memory footprint than MobileNet variants.
Contribution
The paper presents a NAS approach that combines Bayesian and regularized evolutionary search with early stopping, uses a random forest model to estimate the energy usage of candidate networks, and benchmarks the results on real hardware.
Findings
Achieved an order of magnitude reduction in energy per inference.
Produced models with smaller memory footprints than MobileNet variants.
Slightly improved accuracy on sound-event classification.
Abstract
Mobile and edge computing devices for always-on classification tasks require energy-efficient neural network architectures. In this paper, we present several changes to neural architecture search (NAS) that improve the chance of success in practical situations. Our search simultaneously optimizes for network accuracy, energy efficiency, and memory usage. We benchmark the performance of our search on real hardware, but since running thousands of tests with real hardware is difficult, we use a random forest model to roughly predict the energy usage of a candidate network. We present a search strategy that uses both Bayesian and regularized evolutionary search with particle swarms, and employs early stopping to reduce the computational burden. Our search, evaluated on a sound-event classification dataset based upon AudioSet, results in an order of magnitude less energy per inference and a…
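As a rough illustration of the regularized evolutionary component named in the abstract, the sketch below shows the generic form of regularized (aging) evolution, in which the oldest member of the population is discarded each cycle rather than the worst. It is not the authors' implementation: the search space (`LAYERS`, `CHANNELS`), the `toy_score` objective, and all parameter values are hypothetical stand-ins, and the Bayesian search, particle swarms, random forest energy model, and early stopping described above are left out.

```python
import random
from collections import deque

# Hypothetical search space: an architecture is a (num_layers, channels) pair.
LAYERS = [2, 4, 6, 8]
CHANNELS = [8, 16, 32, 64]


def toy_score(arch):
    """Made-up scalar objective: an accuracy proxy minus an energy penalty."""
    layers, channels = arch
    accuracy_proxy = 1.0 - 1.0 / (layers * channels) ** 0.5
    energy_proxy = layers * channels / 512.0
    return accuracy_proxy - 0.5 * energy_proxy


def mutate(arch, rng):
    """Resample one randomly chosen dimension of the architecture."""
    layers, channels = arch
    if rng.random() < 0.5:
        layers = rng.choice(LAYERS)
    else:
        channels = rng.choice(CHANNELS)
    return (layers, channels)


def regularized_evolution(cycles=200, population_size=10, sample_size=3, seed=0):
    """Return the best (arch, score) pair evaluated during the search."""
    rng = random.Random(seed)
    population = deque()
    history = []

    # Seed the population with random architectures.
    for _ in range(population_size):
        arch = (rng.choice(LAYERS), rng.choice(CHANNELS))
        population.append((arch, toy_score(arch)))
        history.append(population[-1])

    for _ in range(cycles):
        # Tournament selection: the best of a random sample becomes the parent.
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda item: item[1])
        child = mutate(parent[0], rng)
        population.append((child, toy_score(child)))
        history.append(population[-1])
        # Discard the oldest member, not the worst one.
        population.popleft()

    return max(history, key=lambda item: item[1])
```

Calling `regularized_evolution(seed=0)` returns the highest-scoring architecture seen during the run together with its score. In a real search, `toy_score` would be replaced by a trained model's measured accuracy combined with predicted energy and memory costs.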
Taxonomy
Topics: Music and Audio Processing · Neural Networks and Applications · Speech and Audio Processing
