Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

Youhei Akimoto; Shinichi Shirakawa; Nozomu Yoshinari; Kento Uchida; Shota Saito; Kouhei Nishida

arXiv:1905.08537 · cs.LG · May 22, 2019 · 51 cites

TL;DR

This paper introduces an adaptive stochastic natural gradient method for neural architecture search that is fast, robust, and broadly applicable, enabling simultaneous optimization of weights and architecture without extensive tuning.

Contribution

It develops a generic, differentiable NAS framework using stochastic relaxation and proposes a novel adaptive stochastic natural gradient method with theoretical backing.

Findings

1. Achieved near state-of-the-art performance on image classification.

2. Demonstrated robustness across different search spaces.

3. Operated efficiently with low computational budgets.

Abstract

High sensitivity of neural architecture search (NAS) methods against their input such as step-size (i.e., learning rate) and search space prevents practitioners from applying them out-of-the-box to their own problems, albeit its purpose is to automate a part of tuning process. Aiming at a fast, robust, and widely-applicable NAS, we develop a generic optimization framework for NAS. We turn a coupled optimization of connection weights and neural architecture into a differentiable optimization by means of stochastic relaxation. It accepts arbitrary search space (widely-applicable) and enables to employ a gradient-based simultaneous optimization of weights and architecture (fast). We propose a stochastic natural gradient method with an adaptive step-size mechanism built upon our theoretical investigation (robust). Despite its simplicity and no problem-dependent parameter tuning, our method…
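The stochastic relaxation idea in the abstract can be illustrated with a minimal sketch; this is not the authors' implementation. It assumes the architecture is a list of categorical choices drawn from a distribution with probabilities `theta`, and it uses the standard Monte Carlo natural gradient for a categorical distribution in expectation parameters (utility times one-hot sample minus `theta`). The step size `lr` is fixed here, so the adaptive step-size mechanism the paper proposes is deliberately left out. The names `sample`, `natural_gradient_step`, `toy_loss`, and `search` are hypothetical, and `toy_loss` stands in for the validation loss of a trained network.

```python
import random

def sample(theta, rng):
    """Draw one choice index per categorical variable."""
    return [rng.choices(range(len(row)), weights=row)[0] for row in theta]

def natural_gradient_step(theta, samples, utilities, lr, eps=1e-3):
    """One stochastic natural gradient ascent step on the expected utility."""
    n = len(samples)
    new_theta = []
    for d, row in enumerate(theta):
        updated = []
        for k, p in enumerate(row):
            # Monte Carlo estimate: mean of u * (one_hot(c)[k] - theta[k]).
            g = sum(u * ((1.0 if c[d] == k else 0.0) - p)
                    for c, u in zip(samples, utilities)) / n
            # Assumption: floor each probability so every choice stays samplable.
            updated.append(max(p + lr * g, eps))
        total = sum(updated)
        new_theta.append([p / total for p in updated])
    return new_theta

def toy_loss(choice, target):
    """Hypothetical stand-in for a validation loss: number of mismatches."""
    return sum(c != t for c, t in zip(choice, target))

def search(target, num_choices=3, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    # Start from the uniform distribution over choices.
    theta = [[1.0 / num_choices] * num_choices for _ in target]
    for _ in range(steps):
        a, b = sample(theta, rng), sample(theta, rng)
        la, lb = toy_loss(a, target), toy_loss(b, target)
        # Ranking-based utility: +1 if a is better, -1 if worse, 0 on a tie.
        u = (lb > la) - (la > lb)
        theta = natural_gradient_step(theta, [a, b], [u, -u], lr)
    return theta
```

Calling `search([0, 2, 1])` returns one probability row per variable; taking the index of the largest entry in each row gives the most likely architecture under the learned distribution. In a real one-shot setting the loss would also depend on shared network weights updated in the same loop, which this toy omits.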

Code & Models

Repositories

shirakawas/ASNG-NAS (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Neural Networks and Applications

Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory