ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

Han Cai; Ligeng Zhu; Song Han

arXiv:1812.00332·cs.LG·February 26, 2019·285 cites

TL;DR

ProxylessNAS searches neural architectures directly on large-scale target tasks and target hardware. By cutting the computational cost and GPU memory usage of the search, it produces architectures specialized for the task and hardware at hand.

Contribution

The paper introduces a memory-efficient differentiable NAS method that optimizes architectures directly on the target task and hardware, without relying on proxy tasks.

Findings

01

Achieves state-of-the-art accuracy on CIFAR-10 with fewer parameters.

02

Outperforms MobileNetV2 on ImageNet in accuracy and speed.

03

Reduces the GPU memory and computational cost of the search to the level of regular training.

Abstract

Neural architecture search (NAS) has a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g. $10^{4}$ GPU hours) makes it difficult to directly search the architectures on large-scale tasks (e.g. ImageNet). Differentiable NAS can reduce the cost in GPU hours via a continuous representation of network architecture, but it suffers from high GPU memory consumption, which grows linearly w.r.t. candidate set size. As a result, these methods need to utilize proxy tasks, such as training on a smaller dataset, learning with only a few blocks, or training for just a few epochs. Architectures optimized on proxy tasks are not guaranteed to be optimal on the target task. In this paper, we present ProxylessNAS that can directly learn the architectures for large-scale…
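The memory claim in the abstract can be made concrete with a small sketch. In a continuous (differentiable) formulation, a layer's output is a weighted sum over all N candidate operations, so all N intermediate outputs must be held at once. The toy code below counts those live outputs. The candidate operations, function names, and the single-path sampling variant are illustrative assumptions, not code from the paper; the truncated abstract above does not describe the authors' mechanism, and sampling one path is shown only as one way the per-layer footprint can stay constant.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations acting on a 1-D feature vector.
CANDIDATES = {
    "identity": lambda x: list(x),
    "double": lambda x: [2.0 * v for v in x],
    "negate": lambda x: [-v for v in x],
    "zero": lambda x: [0.0 for _ in x],
}


def mixed_op_all_paths(x, alphas):
    """Continuous relaxation: weighted sum of every candidate's output.

    Returns the mixed output and the number of candidate outputs that
    had to exist simultaneously (N, i.e. linear in the candidate set).
    """
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATES.values()]  # N outputs alive here
    mixed = [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]
    return mixed, len(outputs)


def mixed_op_single_path(x, alphas, rng):
    """Assumed alternative: sample one candidate and run only that one.

    Returns the output, the number of live candidate outputs (always 1),
    and the name of the sampled candidate.
    """
    weights = softmax(alphas)
    chosen = rng.choices(list(CANDIDATES), weights=weights, k=1)[0]
    return CANDIDATES[chosen](x), 1, chosen


if __name__ == "__main__":
    features = [1.0, 2.0]
    arch_params = [0.0, 0.0, 0.0, 0.0]  # one logit per candidate
    _, live_all = mixed_op_all_paths(features, arch_params)
    _, live_one, name = mixed_op_single_path(features, arch_params, random.Random(0))
    print("all paths, live outputs:", live_all)
    print("single path, live outputs:", live_one, "sampled:", name)
```

Adding a fifth entry to `CANDIDATES` raises the live-output count of `mixed_op_all_paths` to five, while `mixed_op_single_path` stays at one, which mirrors the linear-versus-constant contrast the abstract points to.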

Code & Models

Models

🤗 STMicroelectronics/proxylessnas_pt (model)

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Differentiable Neural Architecture Search · Average Pooling · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · 1x1 Convolution · Batch Normalization · DropPath · Global Average Pooling