FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining
Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez

TL;DR
This paper introduces Neural Architecture-Recipe Search (NARS), a method that jointly searches for neural network architectures and their training recipes. The resulting compact FBNetV3 models improve both efficiency and accuracy across multiple tasks, including ImageNet classification and object detection.
Contribution
The paper proposes a novel joint architecture-recipe search method with predictor pretraining, enabling fast, resource-efficient generation of high-performing neural networks.
Findings
FBNetV3 outperforms manually-designed models on ImageNet with fewer FLOPs.
Pretraining the NARS accuracy predictor improves its sample efficiency and prediction reliability.
FBNetV3 improves object detection performance with fewer parameters and FLOPs.
Abstract
Neural Architecture Search (NAS) yields state-of-the-art neural networks that outperform their best manually-designed counterparts. However, previous NAS methods search for architectures under one set of training hyper-parameters (i.e., a training recipe), overlooking superior architecture-recipe combinations. To address this, we present Neural Architecture-Recipe Search (NARS) to search both (a) architectures and (b) their corresponding training recipes, simultaneously. NARS utilizes an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking. Furthermore, to compensate for the enlarged search space, we leverage "free" architecture statistics (e.g., FLOP count) to pretrain the predictor, significantly improving its sample efficiency and prediction reliability. After training the predictor via constrained iterative optimization,…
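The predictor-pretraining idea in the abstract can be illustrated with a minimal sketch. It assumes a tiny one-hidden-layer encoder, synthetic data, and a least-squares accuracy head. All function names, dimensions, and data here are hypothetical and are not taken from the paper; this is not the authors' training procedure. Stage 1 fits the encoder to a "free" statistic (a stand-in for FLOP count), and stage 2 reuses the encoder to score architecture-recipe pairs from a small labeled set.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_encoder(in_dim, hidden_dim):
    # Hypothetical one-hidden-layer architecture encoder.
    return {"W": rng.normal(0, 0.1, (in_dim, hidden_dim)), "b": np.zeros(hidden_dim)}

def encode(enc, arch):
    return np.tanh(arch @ enc["W"] + enc["b"])

def pretrain_on_proxy(enc, arch, proxy, steps=500, lr=0.02):
    """Stage 1: fit encoder plus a linear head to a free statistic (e.g. FLOPs)."""
    head = np.zeros(enc["W"].shape[1])
    n = len(arch)
    losses = []
    for _ in range(steps):
        h = encode(enc, arch)                     # (n, hidden)
        err = h @ head - proxy                    # (n,)
        losses.append(float(np.mean(err ** 2)))   # mean squared error
        grad_head = h.T @ err * (2 / n)
        grad_pre = np.outer(err, head) * (1 - h ** 2) * (2 / n)  # backprop through tanh
        enc["W"] -= lr * (arch.T @ grad_pre)
        enc["b"] -= lr * grad_pre.sum(axis=0)
        head -= lr * grad_head
    return losses

def features(enc, arch, recipe):
    # Joint input: pretrained architecture embedding, recipe values, bias term.
    return np.hstack([encode(enc, arch), recipe, np.ones((len(arch), 1))])

def fit_accuracy_head(enc, arch, recipe, acc):
    """Stage 2: least-squares head on a small set of trained samples (encoder frozen)."""
    coef, *_ = np.linalg.lstsq(features(enc, arch, recipe), acc, rcond=None)
    return coef

def predict_accuracy(enc, coef, arch, recipe):
    return features(enc, arch, recipe) @ coef

# Synthetic stand-ins: 200 unlabeled architectures with a cheap FLOP-like statistic.
arch_pool = rng.random((200, 8))
flops = arch_pool @ rng.random(8)
proxy = (flops - flops.mean()) / flops.std()

enc = init_encoder(in_dim=8, hidden_dim=16)
losses = pretrain_on_proxy(enc, arch_pool, proxy)

# Only 40 (architecture, recipe) pairs carry an "accuracy" label (synthetic here).
arch_lab = arch_pool[:40]
recipe_lab = rng.random((40, 2))
acc_lab = 0.7 + 0.05 * arch_lab.mean(axis=1) + 0.02 * recipe_lab[:, 0]
coef = fit_accuracy_head(enc, arch_lab, recipe_lab, acc_lab)

# Score and rank unseen candidate pairs jointly.
cand_arch = rng.random((5, 8))
cand_recipe = rng.random((5, 2))
scores = predict_accuracy(enc, coef, cand_arch, cand_recipe)
best = int(np.argmax(scores))
print("best candidate index:", best)
```

The point of the sketch is the two-stage structure: the proxy statistic costs nothing to compute for any architecture, so the encoder can see many more samples than the expensive accuracy labels would allow.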
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
