Graph is all you need? Lightweight data-agnostic neural architecture search without training

Zhenhan Huang; Tejaswini Pedapati; Pin-Yu Chen; Chunheng Jiang; Jianxi Gao

arXiv:2405.01306 · cs.LG · June 23, 2025

TL;DR

This paper introduces nasgraph, a lightweight, training-free neural architecture search method that uses graph measures as proxies, significantly reducing computational costs while maintaining competitive performance across multiple datasets.

Contribution

The paper presents a novel, data-agnostic NAS approach that replaces training with graph-based proxies, enabling rapid architecture search without training candidates for performance evaluation.

Findings

01

Finds the best architecture among 200 randomly sampled NASBench-201 architectures in 217 CPU seconds.

02

Achieves competitive results on NASBench-101, NASBench-201, and NDS.

03

Generalizes well to Micro TransNAS-Bench-101.

Abstract

Neural architecture search (NAS) enables the automatic design of neural network models. However, training the candidates generated by the search algorithm for performance evaluation incurs considerable computational overhead. Our method, dubbed nasgraph, remarkably reduces the computational costs by converting neural architectures to graphs and using the average degree, a graph measure, as the proxy in lieu of the evaluation metric. Our training-free NAS method is data-agnostic and lightweight. It can find the best architecture among 200 randomly sampled architectures from NASBench-201 in 217 CPU seconds. In addition, our method achieves competitive performance on various benchmarks, including the NASBench-101, NASBench-201, and NDS search spaces. We also demonstrate that nasgraph generalizes to more challenging tasks on Micro TransNAS-Bench-101.
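The proxy idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the conversion from a neural architecture to a graph is not described on this page, so the candidate graphs below are hypothetical toy edge lists. The function names `average_degree` and `rank_by_proxy` are illustrative, the average degree of a directed graph is taken here as edges per node (conventions vary), and the rule that a higher score ranks first is an assumption.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def average_degree(num_nodes: int, edges: Iterable[Edge]) -> float:
    """Average degree of a directed graph, taken here as edges per node.

    Assumption: this convention is chosen for illustration only.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    unique_edges = set(edges)  # ignore duplicate edges
    return len(unique_edges) / num_nodes


def rank_by_proxy(
    candidates: Dict[str, Tuple[int, List[Edge]]]
) -> List[Tuple[str, float]]:
    """Score each candidate graph and sort by score, highest first.

    Assumption: a higher average degree is treated as better.
    """
    scores = {
        name: average_degree(num_nodes, edges)
        for name, (num_nodes, edges) in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy graphs standing in for converted architectures.
candidates = {
    "arch_a": (4, [(0, 1), (1, 2), (2, 3)]),
    "arch_b": (4, [(0, 1), (0, 2), (1, 3), (2, 3), (0, 3)]),
    "arch_c": (4, [(0, 1), (1, 3)]),
}

ranking = rank_by_proxy(candidates)
best_name, best_score = ranking[0]
print(best_name, best_score)
```

No training data or gradient step appears anywhere in the scoring, which is what makes a proxy of this kind training-free and cheap to run on a CPU.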

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

- The paper is, in general, well written, and the results and evaluation are well presented.
- Results on the NAS-Bench-Suite-Zero tasks are quite competitive. The evaluation of the complementary nature of the proxy in Table 5 is interesting.
- The method is fairly novel and interesting.

Weaknesses

- I went through the example in Figure 1 and its caption, and the example is still unclear to me. Could the authors please elaborate on this?
- I find the evaluation of the method quite weak, especially since the authors do not compare against the MeCo proxy https://openreview.net/pdf?id=KFm2lZiI7n. At first glance, it seems that MeCo outperforms the proposed proxy on most of the benchmarks. The code for MeCo is publicly accessible and I encourage the authors to compare their work with

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

- While this is not the first work to perform architecture search with graph-based representations of neural architectures, I believe the proposed approach is far more thorough in incorporating the actual computations that occur within neural architectures. NASGraph goes above and beyond simply converting neural architectures into DAGs by considering how the inputs are being processed and mapped to outputs during the forward propagation process.
- The final proxy metric to the validation accurac

Weaknesses

- It appears that NASGraph assumes the inputs to the neural architecture and subsequent graph blocks are non-negative. Does the analysis hold even with negative inputs? Many modern neural architectures utilize activation functions that could yield negative inputs/outputs (e.g., GELU activation). Can NASGraph generalize beyond ReLU-based architectures?
- In a similar vein, can the proposed method generalize to non-conv-based architecture spaces, such as ViTs and MLPMixers?
- The authors def

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

- The proposed method is simple and easy to understand.
- The proposed conversion method seems novel.
- The authors suggest techniques for further speeding up the proposed method by using surrogate models.
- The empirical results show the superiority of the proposed method.
- The paper contains the comparison among various graph measures in its appendix.

Weaknesses

- I think the paper needs more justification of the importance of training-free, data-agnostic NAS.
- The experimental results are evaluated on only simple benchmarks. To show the practicality of the proposed method, I think it would be more helpful if you provided a comparison with other non-training-free NAS methods, such as DARTS-PT, on a larger search space such as the DARTS space.

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Advanced Graph Neural Networks