SMASH: One-Shot Model Architecture Search through HyperNetworks
Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston

TL;DR
SMASH trains an auxiliary HyperNet that generates a network's weights conditioned on its architecture, so candidate architectures can be ranked by their validation performance at the cost of a single training run.
Contribution
The paper presents a HyperNet technique that accelerates neural architecture search by generating a main model's weights conditioned on that model's architecture, so a wide range of architectures can be explored at the cost of a single training run.
Findings
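Achieved performance competitive with similarly-sized hand-designed networks on CIFAR-10, CIFAR-100, STL-10, ModelNet10, and Imagenet32x32.
Enabled search over a wide range of architectures at the cost of a single training run.
Introduced a flexible mechanism based on memory read-writes for defining network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases.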
Abstract
Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is…
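The search procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the main model here is a tiny linear regressor, an "architecture" is a hypothetical binary mask choosing which inputs are connected, and the HyperNet is assumed to be a plain linear map from that mask to the weights. All names (`generate_weights`, `train_hypernet`, `validation_loss`) and the synthetic data are assumptions made for illustration only.

```python
import itertools
import random

rng = random.Random(0)
D = 4                            # optional input connections in the toy main model
TRUE_W = [2.0, -3.0, 0.0, 0.0]   # synthetic target: only inputs 0 and 1 matter


def sample_example():
    """Draw one noiseless (x, y) pair from the synthetic regression task."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(D)]
    y = sum(t * xi for t, xi in zip(TRUE_W, x))
    return x, y


# HyperNet parameters (assumed linear form): w = H @ arch + b
H = [[0.0] * D for _ in range(D)]
b = [0.0] * D


def generate_weights(arch):
    """HyperNet forward pass: map an architecture encoding to main-model weights."""
    return [b[i] + sum(H[i][j] * arch[j] for j in range(D)) for i in range(D)]


def predict(arch, w, x):
    """Main model: a linear unit that only sees the inputs enabled in `arch`."""
    return sum(arch[i] * w[i] * x[i] for i in range(D))


def train_hypernet(steps=20000, lr=0.02):
    """Single training run: sample a random architecture at every step and
    update only the HyperNet parameters with the squared-error gradient."""
    for _ in range(steps):
        arch = [rng.randint(0, 1) for _ in range(D)]
        x, y = sample_example()
        w = generate_weights(arch)
        err = predict(arch, w, x) - y
        for i in range(D):
            g = 2.0 * err * arch[i] * x[i]   # d(loss)/d(w_i)
            b[i] -= lr * g                   # d(w_i)/d(b_i) = 1
            for j in range(D):
                H[i][j] -= lr * g * arch[j]  # d(w_i)/d(H_ij) = arch_j


def validation_loss(arch, val_set):
    """Score an architecture using HyperNet-generated weights, without training it."""
    w = generate_weights(arch)
    return sum((predict(arch, w, x) - y) ** 2 for x, y in val_set) / len(val_set)


train_hypernet()
val_set = [sample_example() for _ in range(200)]

# The toy space has only 2**D architectures, so it is enumerated here;
# a realistic search space would be sampled instead.
candidates = [list(bits) for bits in itertools.product([0, 1], repeat=D)]
ranked = sorted(candidates, key=lambda a: validation_loss(a, val_set))
best = ranked[0]
print("best architecture by proxy validation loss:", best)
```

The point of the sketch is the division of labour: only the HyperNet is trained, and every candidate is then scored with generated weights, so the ranking costs one training run plus cheap validation passes. The scores are only meaningful relative to one another, which is why the candidates are sorted rather than judged against an absolute threshold.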
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Scientific Computing and Data Management
Methods: Bottleneck Residual Block · Residual Connection · Convolution · Residual Block · Average Pooling · Concatenated Skip Connection · Fractal Block · Global Average Pooling · Dense Block
