sharpDARTS: Faster and More Accurate Differentiable Architecture Search
Andrew Hundt, Varun Jain, Gregory D. Hager

TL;DR
sharpDARTS refines Differentiable Architecture Search (DARTS) with a redesigned search space, a better-optimized learning rate schedule, and added regularization. The search is 50% faster than DARTS, yields a 20-30% relative improvement in final model error on CIFAR-10, and produces a model that transfers competitively to ImageNet.
Contribution
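The paper analyzes limitations in a widely used search space and in DARTS, then presents sharpDARTS: novel network blocks with a more general, balanced, and consistent design; a Cosine Power Annealing learning rate schedule; and Max-W regularization. Together these lead to a faster search and improved final accuracy.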
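For context, DARTS, the method sharpDARTS builds on, makes the choice among candidate operations differentiable by replacing it with a softmax-weighted mixture over all candidates, then keeping the highest-weighted one after search. The following is a minimal sketch of that idea on scalar inputs. The toy operations and the names `CANDIDATE_OPS`, `mixed_op`, and `discretize` are illustrative assumptions, not the sharpDARTS blocks or the authors' code.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Toy stand-ins for candidate operations (hypothetical; real search spaces
# use convolutions, pooling, skip connections, and similar layers).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate's output."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

def discretize(alphas):
    """After search, keep the candidate with the largest architecture weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]

# With equal architecture weights, each candidate contributes one third.
print(mixed_op(1.0, [0.0, 0.0, 0.0]))
print(discretize([0.1, 2.0, -1.0]))
```

In an actual search the `alphas` would be learned by gradient descent alongside the network weights; here they are fixed by hand to keep the sketch self-contained.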
Findings
sharpDARTS is 50% faster than DARTS.
Best single model run achieves 1.93% error on CIFAR-10, state of the art among models of similar size.
Proposes Max-W regularization to improve generalization.
Abstract
Neural Architecture Search (NAS) has been a source of dramatic improvements in neural network design, with recent results meeting or exceeding the performance of hand-tuned architectures. However, our understanding of how to represent the search space for neural net architectures and how to search that space efficiently are both still in their infancy. We have performed an in-depth analysis to identify limitations in a widely used search space and a recent architecture search method, Differentiable Architecture Search (DARTS). These findings led us to introduce novel network blocks with a more general, balanced, and consistent design; a better-optimized Cosine Power Annealing learning rate schedule; and other improvements. Our resulting sharpDARTS search is 50% faster with a 20-30% relative improvement in final model error on CIFAR-10 when compared to DARTS. Our best single model run…
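The abstract names a Cosine Power Annealing learning rate schedule but this summary does not give its formula. The sketch below assumes one plausible form: a standard half-cosine term `sigma` that falls from 1 to 0 is warped through a power curve with base `p`, so the rate drops faster early on and spends more epochs near the minimum. The function name, the parameter names, and the default values are assumptions for illustration and should be checked against the paper before use.

```python
import math

def cosine_power_annealing(epoch, total_epochs, lr_max=0.025, lr_min=1e-4, p=10.0):
    """Assumed form: lr = lr_min + (lr_max - lr_min) * (p**(sigma+1) - p) / (p**2 - p).

    sigma is the usual half-cosine term, going from 1 at epoch 0 to 0 at the
    final epoch. Setting p = 1 recovers plain cosine annealing (the limit of
    the warped term as p approaches 1 is sigma itself).
    """
    if p <= 0.0:
        raise ValueError("p must be positive")
    sigma = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    if p == 1.0:
        scale = sigma  # limit case: ordinary cosine annealing
    else:
        scale = (p ** (sigma + 1.0) - p) / (p ** 2 - p)
    return lr_min + (lr_max - lr_min) * scale

# Build a 100-epoch schedule and compare its midpoint with plain cosine.
schedule = [cosine_power_annealing(t, 100) for t in range(101)]
plain = [cosine_power_annealing(t, 100, p=1.0) for t in range(101)]
print(schedule[0], schedule[50], schedule[100])
print(plain[50])
```

Under these assumptions both schedules start at `lr_max` and end at `lr_min`; the power-warped one sits below plain cosine in between, which is the intended effect of choosing `p` greater than 1.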
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Exponential Decay · Differentiable Architecture Search Max-W · Differentiable Hyperparameter Search · Differentiable Architecture Search · Cosine Power Annealing
