Neural Architecture Search by Estimation of Network Structure Distributions

Anton Muravev; Jenni Raitoharju; Moncef Gabbouj

arXiv:1908.06886 · cs.NE · January 29, 2021

TL;DR

This paper introduces a probabilistic approach to neural architecture search that models entire network structures, enabling the discovery of irregular, high-performance architectures beyond traditional block-based methods.

Contribution

It proposes a novel probabilistic representation for neural networks and an estimation of distribution algorithm to efficiently explore diverse, irregular architectures.

Findings

1. Discovered non-regular architectures with competitive accuracy.
2. Achieved efficient search without complex dataflows or training techniques.
3. Demonstrated interpretability and extensibility of the probabilistic model.

Abstract

The influence of deep learning is continuously expanding across different domains, and its new applications are ubiquitous. The question of neural network design thus increases in importance, as traditional empirical approaches are reaching their limits. Manual design of network architectures from scratch relies heavily on trial and error, while using existing pretrained models can introduce redundancies or vulnerabilities. Automated neural architecture design is able to overcome these problems, but the most successful algorithms operate on significantly constrained design spaces, assuming the target network to consist of identical repeating blocks. While such an approach allows for faster search, it does so at the cost of expressivity. We instead propose an alternative probabilistic representation of a whole neural network structure under the assumption of independence between layer…
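
To make the general idea concrete, the sketch below shows a generic estimation of distribution loop over a network described by one independent categorical distribution per layer position. It is not the authors' algorithm: the layer vocabulary (`LAYER_TYPES`), the fixed depth (`NUM_LAYERS`), the smoothing rate `lr`, and especially `toy_fitness` are hypothetical stand-ins chosen for illustration. A real search would replace `toy_fitness` with training and validating each sampled network.

```python
import random

# Hypothetical layer vocabulary and fixed depth, chosen only for illustration.
LAYER_TYPES = ["conv3x3", "conv5x5", "maxpool", "identity"]
NUM_LAYERS = 6


def sample_architecture(probs, rng):
    """Draw one layer type per position, each position independently."""
    return [rng.choices(LAYER_TYPES, weights=p, k=1)[0] for p in probs]


def toy_fitness(arch):
    """Placeholder score (assumption): reward convolutions, penalize repeats.

    A real search would train the sampled network and return its
    validation accuracy instead.
    """
    score = sum(1.0 for layer in arch if layer.startswith("conv"))
    score -= 0.5 * sum(1 for a, b in zip(arch, arch[1:]) if a == b)
    return score


def update_distribution(probs, elite, lr=0.3):
    """Shift each per-layer distribution toward the elite samples' frequencies."""
    new_probs = []
    for pos, p in enumerate(probs):
        counts = [sum(1 for arch in elite if arch[pos] == t) for t in LAYER_TYPES]
        freqs = [c / len(elite) for c in counts]
        # Convex combination of two distributions, so each row still sums to 1.
        new_probs.append([(1 - lr) * old + lr * f for old, f in zip(p, freqs)])
    return new_probs


def run_eda(generations=20, population=30, elite_size=10, seed=0):
    """Sample, rank, and re-estimate the distribution for a few generations."""
    rng = random.Random(seed)
    uniform = 1.0 / len(LAYER_TYPES)
    probs = [[uniform] * len(LAYER_TYPES) for _ in range(NUM_LAYERS)]
    best = None
    for _ in range(generations):
        samples = [sample_architecture(probs, rng) for _ in range(population)]
        samples.sort(key=toy_fitness, reverse=True)
        if best is None or toy_fitness(samples[0]) > toy_fitness(best):
            best = samples[0]
        probs = update_distribution(probs, samples[:elite_size])
    return probs, best


if __name__ == "__main__":
    final_probs, best_arch = run_eda()
    print(best_arch)
```

Because every position has its own distribution, sampled networks are not forced into identical repeating blocks, and the learned per-layer probabilities can be read directly to see which layer types the search came to prefer at each depth.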
