TL;DR
This paper introduces a differentiable neural architecture search (DNAS) method that co-optimizes accuracy and energy under a memory constraint, producing efficient models suitable for edge devices.
Contribution
It presents the first DNAS approach that co-optimizes accuracy and energy (or latency) under a memory constraint set by the target hardware, a realistic deployment scenario for edge AI.
Findings
Generated architectures form rich Pareto fronts in energy vs. accuracy space.
Energy is reduced by up to 2.2x with negligible accuracy loss.
Models meet diverse memory constraints from 75% to 6.25% of baseline.
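To make the Pareto-front finding concrete, the sketch below shows one common way to extract the non-dominated points from a set of (energy, accuracy) pairs. The function name `pareto_front` and the sample numbers are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (energy, accuracy); lower energy and higher accuracy are better


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point beats on both energy and accuracy."""
    front: List[Point] = []
    best_accuracy = float("-inf")
    # Visit candidates from lowest to highest energy; on equal energy,
    # visit the more accurate one first.
    for energy, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        # A point survives only if it is more accurate than every
        # cheaper-or-equal point already seen.
        if accuracy > best_accuracy:
            front.append((energy, accuracy))
            best_accuracy = accuracy
    return front


# Hypothetical (normalized energy, accuracy) pairs, NOT taken from the paper.
candidates = [(1.0, 0.90), (0.6, 0.89), (0.45, 0.88), (0.7, 0.85), (0.45, 0.80)]
front = pareto_front(candidates)
print(front)
```

In this made-up set, the points (0.7, 0.85) and (0.45, 0.80) are dropped because a cheaper or equally cheap candidate is more accurate.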
Abstract
Neural Architecture Search (NAS) is increasingly popular to automatically explore the accuracy versus computational complexity trade-off of Deep Learning (DL) architectures. When targeting tiny edge devices, the main challenge for DL deployment is matching the tight memory constraints, hence most NAS algorithms consider model size as the complexity metric. Other methods reduce the energy or latency of DL models by trading off accuracy and number of inference operations. Energy and memory are rarely considered simultaneously, in particular by low-search-cost Differentiable NAS (DNAS) solutions. We overcome this limitation by proposing the first DNAS that directly addresses the most realistic scenario from a designer's perspective: the co-optimization of accuracy and energy (or latency) under a memory constraint, determined by the target HW. We do so by combining two complexity-dependent…
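The abstract is cut off before it states what is combined, so the following is not a reconstruction of the paper's formulation. It is a minimal sketch of a generic pattern for differentiable search under a budget, assuming a softmax relaxation over candidate operations, a weighted energy term, and a hinge penalty that is active only when expected memory exceeds the budget. All names (`search_objective`, `energy_weight`, `memory_weight`, `memory_budget`) and all numbers are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over one layer's architecture logits."""
    peak = max(logits)
    exps = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_cost(logits: Sequence[float], costs: Sequence[float]) -> float:
    """Expected cost of a layer under its relaxed choice of candidate operation."""
    return sum(p * c for p, c in zip(softmax(logits), costs))


def search_objective(
    task_loss: float,
    layer_logits: Sequence[Sequence[float]],
    layer_energy: Sequence[Sequence[float]],
    layer_memory: Sequence[Sequence[float]],
    memory_budget: float,
    energy_weight: float = 0.1,   # assumed trade-off strength
    memory_weight: float = 1.0,   # assumed penalty strength
) -> float:
    """Task loss + soft energy term + hinge penalty on memory above the budget."""
    energy = sum(expected_cost(l, e) for l, e in zip(layer_logits, layer_energy))
    memory = sum(expected_cost(l, m) for l, m in zip(layer_logits, layer_memory))
    memory_penalty = max(0.0, memory - memory_budget)  # zero when within budget
    return task_loss + energy_weight * energy + memory_weight * memory_penalty


# One layer, two candidate operations with equal logits (hypothetical costs).
over_budget = search_objective(1.0, [[0.0, 0.0]], [[2.0, 4.0]], [[10.0, 30.0]], memory_budget=15.0)
within_budget = search_objective(1.0, [[0.0, 0.0]], [[2.0, 4.0]], [[10.0, 30.0]], memory_budget=25.0)
```

The hinge treats memory as a constraint rather than a quantity to minimize: once the expected footprint fits the budget, only the task loss and the energy term shape the search. In a real DNAS setup these quantities would be tensors in an autodiff framework so that gradients reach the architecture logits.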
Taxonomy
Methods: Gumbel Softmax · Differentiable Neural Architecture Search
