Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces
Bert Moons, Parham Noorzad, Andrii Skliar, Giovanni Mariani, Dushyant Mehta, Chris Lott, Tijmen Blankevoort

TL;DR
DONNA is a scalable, rapid neural architecture search pipeline that efficiently finds diverse, hardware-aware neural network models using a knowledge distillation-based accuracy predictor and evolutionary search, outperforming existing methods in speed and efficiency.
Contribution
We introduce DONNA, a novel NAS framework that combines knowledge distillation, evolutionary search, and rapid finetuning to enable scalable, diverse, and hardware-aware neural network design.
Findings
DONNA is up to 100x faster than MNasNet in architecture search.
DONNA architectures are 20% faster than EfficientNet-B0 on an Nvidia V100 at similar accuracy.
DONNA achieves 10% faster inference with slightly higher accuracy than MobileNetV2-1.4x on a smartphone.
Abstract
Current state-of-the-art Neural Architecture Search (NAS) methods neither scale efficiently to multiple hardware platforms nor handle diverse architectural search spaces. To remedy this, we present DONNA (Distilling Optimal Neural Network Architectures), a novel pipeline for rapid, scalable and diverse NAS that scales to many user scenarios. DONNA consists of three phases. First, an accuracy predictor is built using blockwise knowledge distillation from a reference model. This predictor enables searching across diverse networks with varying macro-architectural parameters such as layer types and attention mechanisms, as well as across micro-architectural parameters such as block repeats and expansion rates. Second, a rapid evolutionary search finds a set of Pareto-optimal architectures for any scenario using the accuracy predictor and on-device measurements. Third, optimal models are…
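To make the second phase concrete, the sketch below shows one way to pick the Pareto-optimal set from candidates that have already been scored by an accuracy predictor and measured on a device. It is a minimal illustration, not the authors' implementation: the `Candidate` type, the `pareto_front` helper, and all numbers are hypothetical, and the evolutionary search that would generate the candidates is omitted.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one architecture (names are illustrative)."""
    name: str
    predicted_accuracy: float  # from an accuracy predictor; higher is better
    latency_ms: float          # from an on-device measurement; lower is better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate beats on both axes.

    Sweep in order of increasing latency and keep a candidate only if it
    is strictly more accurate than everything at lower or equal latency.
    Exact duplicates of a kept point are dropped.
    """
    front: List[Candidate] = []
    best_accuracy = float("-inf")
    ordered = sorted(candidates, key=lambda c: (c.latency_ms, -c.predicted_accuracy))
    for c in ordered:
        if c.predicted_accuracy > best_accuracy:
            front.append(c)
            best_accuracy = c.predicted_accuracy
    return front


# Made-up values for illustration only; they are not results from the paper.
candidates = [
    Candidate("a", 0.70, 10.0),
    Candidate("b", 0.75, 15.0),
    Candidate("c", 0.72, 20.0),  # slower and less accurate than "b"
    Candidate("d", 0.78, 30.0),
    Candidate("e", 0.70, 12.0),  # slower than "a" with the same accuracy
]
front_names = [c.name for c in pareto_front(candidates)]
print(front_names)
```

Because latency comes from measurements on the target device, running the same selection with a different device's measurements can give a different front, which is what makes the search hardware-aware.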
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Machine Learning and ELM
Methods: Knowledge Distillation · Softmax · Sigmoid Activation · Dropout · Dense Connections · Squeeze-and-Excitation Block · Global Average Pooling · MnasNet · Pointwise Convolution
