Finding Competitive Network Architectures Within a Day Using UCT
Martin Wistuba

TL;DR
This paper introduces a Monte Carlo planning approach using UCT for neural network architecture search, achieving competitive results within a day of GPU time, significantly reducing search costs compared to prior methods.
Contribution
The paper adapts UCT-based Monte Carlo planning for efficient neural architecture search, enabling competitive results within a single GPU day.
Findings
Found competitive architectures for MNIST, SVHN, and CIFAR-10 within one GPU day.
Outperformed human-designed and other automated architectures with extended search.
Demonstrated the efficiency of UCT-based search in neural architecture optimization.
Abstract
The design of neural network architectures for a new data set is a laborious task which requires human deep learning expertise. In order to make deep learning available for a broader audience, automated methods for finding a neural network architecture are vital. Recently proposed methods can already achieve human-expert-level performance. However, these methods have run times of months or even years of GPU computing time, ignoring hardware constraints as faced by many researchers and companies. We propose the use of Monte Carlo planning in combination with two different UCT (upper confidence bound applied to trees) derivations to search for network architectures. We adapt the UCT algorithm to the needs of network architecture search by proposing two ways of sharing information between different branches of the search tree. In an empirical study we are able to demonstrate that this…
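For readers unfamiliar with UCT, the selection step it builds on can be sketched as follows. This is a minimal illustration of the textbook UCB1 rule applied at a single tree node, not the paper's method: the two UCT derivations and the information sharing between branches mentioned in the abstract are not reproduced here. The function names, the dictionary layout, the action labels, and the exploration constant `c` are all assumptions made for the example.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1 score: mean reward plus an exploration bonus.

    `c` is a hypothetical exploration constant; the paper's own
    choice is not stated in the abstract.
    """
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    mean = total_reward / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus


def select_child(children, c=math.sqrt(2)):
    """Return the child with the highest UCB1 score.

    Each child is a dict with hypothetical keys: "action" (e.g. a layer
    choice), "reward" (sum of observed rewards, such as validation
    accuracy), and "visits". The parent's visit count is taken to be
    the sum of its children's visits.
    """
    parent_visits = sum(ch["visits"] for ch in children)
    return max(
        children,
        key=lambda ch: uct_score(ch["reward"], ch["visits"], parent_visits, c),
    )


# Hypothetical statistics for three candidate layer choices at one node.
children = [
    {"action": "conv3x3", "reward": 1.8, "visits": 2},
    {"action": "conv5x5", "reward": 0.5, "visits": 1},
    {"action": "maxpool", "reward": 0.0, "visits": 0},
]
chosen = select_child(children)
```

In a full search, the chosen action would extend the partial architecture, the resulting network would be trained and evaluated, and the reward would be propagated back up the tree. Setting `c=0` reduces the rule to pure exploitation of the best mean reward observed so far.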
Taxonomy
Topics: Advanced Neural Network Applications · Graph Theory and Algorithms · Machine Learning and Algorithms
