Loss Functions for Predictor-based Neural Architecture Search
Han Ji, Yuqi Feng, Jiahao Fan, Yanan Sun

TL;DR
This paper systematically investigates various loss functions for performance predictors in neural architecture search, revealing how different categories impact predictor effectiveness and providing guidance for their selection.
Contribution
It offers the first comprehensive analysis of regression, ranking, and weighted loss functions in NAS predictors, including empirical evaluation across multiple tasks and search spaces.
Findings
Certain loss function combinations improve predictor accuracy.
Ranking-based losses outperform traditional regression in some scenarios.
Guidelines for selecting loss functions in NAS are provided.
Abstract
Evaluation is a critical but costly procedure in neural architecture search (NAS). Performance predictors have been widely adopted to reduce evaluation costs by directly estimating architecture performance. The effectiveness of predictors is heavily influenced by the choice of loss functions. While traditional predictors employ regression loss functions to evaluate the absolute accuracy of architectures, recent approaches have explored various ranking-based loss functions, such as pairwise and listwise ranking losses, to focus on the ranking of architecture performance. Despite their success in NAS, the effectiveness and characteristics of these loss functions have not been thoroughly investigated. In this paper, we conduct the first comprehensive study on loss functions in performance predictors, categorizing them into three main types: regression, ranking, and weighted loss functions.…
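To make the regression versus ranking distinction concrete, here is a minimal sketch in plain Python. It is not taken from the paper: mean squared error, a pairwise hinge loss, and ListMLE are used as common representatives of the regression, pairwise, and listwise families, and the function names, the `margin` value, and the toy accuracies are all assumptions for illustration. The weighted category is left out because the abstract does not say how the weights are defined.

```python
import math


def mse_loss(pred, target):
    """Regression loss: mean squared error on absolute accuracy values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pairwise_hinge_loss(pred, target, margin=0.1):
    """Pairwise ranking loss (hinge form, assumed here for illustration).

    For every pair where architecture i is truly better than j, penalise
    the predictor unless pred[i] exceeds pred[j] by at least `margin`.
    """
    total, count = 0.0, 0
    for i in range(len(pred)):
        for j in range(len(pred)):
            if target[i] > target[j]:
                total += max(0.0, margin - (pred[i] - pred[j]))
                count += 1
    return total / count if count else 0.0


def listmle_loss(pred, target):
    """Listwise ranking loss (ListMLE, assumed here for illustration).

    Negative log-likelihood of the true ordering under a Plackett-Luce
    model whose scores are the predictions.
    """
    # Arrange predicted scores in the order of the true ranking (best first).
    order = sorted(range(len(pred)), key=lambda i: target[i], reverse=True)
    scores = [pred[i] for i in order]
    loss = 0.0
    for k in range(len(scores)):
        tail = scores[k:]
        m = max(tail)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += log_sum_exp - scores[k]
    return loss


# Toy data (hypothetical): true accuracies of three architectures.
true_acc = [0.94, 0.91, 0.88]
rank_ok = [0.80, 0.60, 0.40]   # correct order, poor absolute values
value_ok = [0.90, 0.92, 0.89]  # close absolute values, top two swapped

for name, pred in [("rank_ok", rank_ok), ("value_ok", value_ok)]:
    print(name,
          round(mse_loss(pred, true_acc), 4),
          round(pairwise_hinge_loss(pred, true_acc), 4),
          round(listmle_loss(pred, true_acc), 4))
```

On this toy data the regression loss prefers `value_ok`, while both ranking losses prefer `rank_ok`. This shows how the choice of loss changes what a predictor is rewarded for, which matters when the search only needs the relative order of candidates.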
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
