Encodings for Prediction-based Neural Architecture Search

Yash Akhauri; Mohamed S. Abdelfattah

arXiv:2403.02484·cs.LG·March 6, 2024·1 cites

TL;DR

This paper investigates encoding methods for predictor-based neural architecture search, introduces unified encodings that work across multiple search spaces, and presents a new predictor, FLAN, that reduces predictor training costs by over an order of magnitude.

Contribution

It categorizes neural encodings, extends them to unified forms across search spaces, and develops FLAN, a predictor that greatly improves efficiency in NAS.

Findings

01

Unified encodings enable transfer across search spaces.

02

FLAN reduces predictor training costs by over an order of magnitude.

03

Experiments on over 1.5 million architectures, spanning NAS spaces such as NB101, NB201, NB301, and NDS, support the proposed methods.

Abstract

Predictor-based methods have substantially enhanced Neural Architecture Search (NAS) optimization. The efficacy of these predictors is largely influenced by the method of encoding neural network architectures. While traditional encodings used an adjacency matrix describing the graph structure of a neural network, novel encodings embrace a variety of approaches from unsupervised pretraining of latent representations to vectors of zero-cost proxies. In this paper, we categorize and investigate neural encodings from three main types: structural, learned, and score-based. Furthermore, we extend these encodings and introduce unified encodings that extend NAS predictors to multiple search spaces. Our analysis draws from experiments conducted on over 1.5 million neural network architectures on NAS spaces such as NASBench-101 (NB101), NB201, NB301, Network Design Spaces (NDS), and…
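
As an illustration of the "structural" category the abstract mentions, the following is a minimal sketch of an adjacency-matrix encoding. It is not taken from the paper or the FLAN code: the function name `structural_encoding`, the operation vocabulary `OP_VOCAB`, and the 4-node example cell are all hypothetical and chosen only to show the idea of turning a cell graph into a fixed-length vector.

```python
from typing import List, Sequence


def structural_encoding(adjacency: Sequence[Sequence[int]],
                        ops: Sequence[str],
                        op_vocab: Sequence[str]) -> List[int]:
    """Flatten a DAG cell into a fixed-length vector (illustrative only).

    The upper triangle of the adjacency matrix (edges i -> j with i < j)
    is concatenated with a one-hot encoding of each node's operation.
    """
    n = len(adjacency)
    if any(len(row) != n for row in adjacency):
        raise ValueError("adjacency must be square")
    if len(ops) != n:
        raise ValueError("one operation label per node is required")

    # Edges of a topologically sorted DAG live strictly above the diagonal.
    edges = [adjacency[i][j] for i in range(n) for j in range(i + 1, n)]

    one_hot: List[int] = []
    for op in ops:
        idx = list(op_vocab).index(op)  # raises ValueError for an unknown op
        one_hot.extend(1 if k == idx else 0 for k in range(len(op_vocab)))

    return edges + one_hot


# Hypothetical vocabulary and 4-node cell: input feeds two parallel branches.
OP_VOCAB = ["input", "conv3x3", "maxpool3x3", "output"]
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
ops = ["input", "conv3x3", "maxpool3x3", "output"]

encoding = structural_encoding(adj, ops, OP_VOCAB)
```

For this example cell, the vector has 6 edge entries followed by 4 × 4 one-hot entries. A vector like this is tied to one search space's node count and operation set, which is the limitation that motivates encodings shared across search spaces.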

Peer Reviews

Decision · ICML 2024 Poster

Reviewer 01 · Rating 8 · accept, good paper · Confidence 5

Strengths

- The motivation to have a unified encoding across NAS spaces is important and, as the authors mention, this is relevant when it comes to transfer learning across spaces and tasks.
- The authors propose a new hybrid encoder that outperforms prior encodings and allows transferability of predictors to new search spaces. This leads to improved sample efficiency compared to training predictors from scratch on a new search space.
- Large-scale study of NAS encodings over 13 NAS search spaces with

Weaknesses

- It seems that the performance predictor is transferable across search spaces, and can predict the ranking relatively well. However, as far as I saw, this is done only on CIFAR-10, right? That means that transferring a learned predictor to a new dataset would not be feasible with FLAN, or otherwise one would need to train FLAN on that dataset from scratch.
- Other than this, I do not have any major weaknesses regarding this paper. I think that this is an important work for

Reviewer 02 · Rating 6 · marginally above the acceptance threshold · Confidence 4

Strengths

The paper is well-written and demonstrates an integration of ideas from prior work, further enhancing them to achieve SOTA sample efficiency in performance prediction and in sample-based NAS. Additionally, it shows superior Kendall tau correlation in performance prediction. The method also permits integration of new encodings, and hence allows taking advantage of developments in architecture encoding, such as new ZCPs. Furthermore, the unified encoding facilitates transfer learning across differ

Weaknesses

- The authors state that improvements gained in sample efficiency by pre-training FLAN on a source search space do not include the cost to pre-train the model. However, this makes sense only if the pre-training is done on a single source space and then transferred to any other space. From Table 4, this doesn't seem to be the case, and each target space has its own source space.
- Experiments on sample-based NAS are limited to NB101. Extending this (for example to NB201) would give a better asse

Reviewer 03 · Rating 3 · reject, not good enough · Confidence 5

Strengths

Generalizing NAS predictors to cover multiple search spaces is essential and an important step forward. For the most part, the paper is easy to read and follow early on. Figures 1-3 especially are nicely done. There is detailed ablation on the design aspects of FLAN. Transfer experiments are performed, as is search.

Weaknesses

There are issues with the contributions and statements made in this manuscript: First, DGF: The authors point out that "GCNs are prone to an over-smoothing problem", although really this issue affects Graph Neural Networks (GNNs) in general. The authors then attempt to validate the efficacy of FLAN's predictor using the DGF and GAT in Table 1. I am not convinced by these experiments. GCN was proposed in 2016 and since then other GNN-types like GAT, GIN [1], GATv2 [2], etc., all of whose manuscr

Code & Models

Repositories

abdelfattah-lab/flan_nas
pytorch · Official

Datasets

akhauriyash/GraphArch-Regression
dataset · 91 dl

Taxonomy

Topics: Machine Learning and Data Classification · Neural Networks and Applications · Time Series Analysis and Forecasting