ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Han Cai, Ligeng Zhu, Song Han

TL;DR
ProxylessNAS searches neural architectures directly on large-scale target tasks and target hardware. It cuts the compute and memory cost of the search, which yields more effective, hardware-specialized architectures.
Contribution
It introduces a memory-efficient differentiable NAS method that directly optimizes architectures for target tasks and hardware without proxy tasks.
Findings
Achieves state-of-the-art accuracy on CIFAR-10 (at the time of publication) with fewer parameters than prior best architectures.
Outperforms MobileNetV2 on ImageNet in both accuracy and speed.
Reduces the GPU memory and compute cost of the search to the level of regular training.
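The speed result reflects searching with the target hardware in view. One way to expose a hardware cost to a gradient-based search is to add the expected latency of the candidate operations to the task loss. The sketch below illustrates that idea only: the latency values, the weight, and the function names are hypothetical, and this summary does not state the paper's exact formulation.

```python
import math


def softmax(alphas):
    """Turn architecture parameters into selection probabilities."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(alphas, latencies_ms):
    """Probability-weighted latency of one searchable layer (assumed form)."""
    probs = softmax(alphas)
    return sum(p * lat for p, lat in zip(probs, latencies_ms))


def total_loss(task_loss, alphas, latencies_ms, weight):
    """Task loss plus a weighted latency penalty (illustrative, not the paper's loss)."""
    return task_loss + weight * expected_latency(alphas, latencies_ms)


# Hypothetical per-operation latencies measured on some target device.
latencies_ms = [2.0, 6.0]
print(expected_latency([0.0, 0.0], latencies_ms))
print(total_loss(1.0, [0.0, 0.0], latencies_ms, weight=0.1))
```

Because the penalty is a smooth function of the architecture parameters, raising the parameter of the faster operation lowers the penalty, so a gradient step can trade accuracy against speed.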
Abstract
Neural architecture search (NAS) has a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g. GPU hours) makes it difficult to directly search the architectures on large-scale tasks (e.g. ImageNet). Differentiable NAS can reduce the cost of GPU hours via a continuous representation of network architecture, but it suffers from high GPU memory consumption, which grows linearly w.r.t. candidate set size. As a result, such methods need to utilize proxy tasks, such as training on a smaller dataset, learning with only a few blocks, or training for just a few epochs. Architectures optimized on proxy tasks are not guaranteed to be optimal on the target task. In this paper, we present ProxylessNAS that can directly learn the architectures for large-scale…
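The linear memory growth mentioned in the abstract can be seen in a toy example. In a continuous relaxation, every candidate operation runs and its output must be kept, so N candidates mean N live intermediate results. Activating a single sampled candidate per step keeps one. The sketch below uses scalar stand-ins for feature maps. The sampling scheme is an illustrative assumption, since this summary does not spell out how the method reduces memory, and all names are hypothetical.

```python
import math
import random


def softmax(alphas):
    """Turn architecture parameters into selection probabilities."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_forward(x, candidates, alphas):
    """Continuous relaxation: all candidates run, so N outputs are held at once."""
    probs = softmax(alphas)
    outputs = [op(x) for op in candidates]  # N intermediate results
    y = sum(p * o for p, o in zip(probs, outputs))
    return y, len(outputs)


def sampled_forward(x, candidates, alphas, rng):
    """Single-path variant (assumed): only one candidate runs per step."""
    probs = softmax(alphas)
    idx = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
    return candidates[idx](x), 1


# Toy candidate set: identity, scaling, squaring, and a zero (skip) op.
candidates = [lambda x: x, lambda x: 2.0 * x, lambda x: x * x, lambda x: 0.0]
alphas = [0.0, 0.0, 0.0, 0.0]

y_mixed, live_mixed = mixed_forward(3.0, candidates, alphas)
y_single, live_single = sampled_forward(3.0, candidates, alphas, random.Random(0))
print(live_mixed, live_single)
```

The count of live outputs stands in for activation memory: it equals the candidate set size in the mixed case and stays at one in the sampled case, regardless of how many candidates are added.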
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Differentiable Neural Architecture Search · Average Pooling · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · 1x1 Convolution · Batch Normalization · DropPath · Global Average Pooling
