Global optimization of graph acquisition functions for neural architecture search

Yilin Xie; Shiqiang Zhang; Jixiang Qing; Ruth Misener; Calvin Tsay

arXiv:2505.23640·cs.LG·May 30, 2025

Open Access 3 Reviews

TL;DR

This paper introduces explicit optimization formulations for graph input spaces in neural architecture search, enabling more effective graph Bayesian optimization by addressing the complexity of discrete graph search spaces.

Contribution

It proposes novel explicit optimization formulations for graph spaces in NAS, with theoretical proofs and practical algorithms for improved graph BO performance.

Findings

01

Efficiently finds optimal architectures on NAS benchmarks.

02

The proposed encoding is theoretically proven to be an equivalent representation of the graph space.

03

Outperforms existing methods in several NAS tasks.

Abstract

Graph Bayesian optimization (BO) has shown potential as a powerful and data-efficient tool for neural architecture search (NAS). Most existing graph BO works focus on developing graph surrogate models, i.e., metrics of networks and/or different kernels to quantify the similarity between networks. However, acquisition optimization, as a discrete optimization task over graph structures, is not well studied due to the complexity of formulating the graph search space and acquisition functions. This paper presents explicit optimization formulations for the graph input space, including properties such as reachability and shortest paths, which are later used to formulate graph kernels and the acquisition function. We theoretically prove that the proposed encoding is an equivalent representation of the graph space and provide restrictions for the NAS domain with either node or edge labels.…
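The abstract names reachability and shortest paths as the graph properties carried by the encoding. As a rough illustration of what those two quantities are for a small NAS-style cell, the sketch below computes them directly from an adjacency matrix with breadth-first search. This is not the paper's optimization formulation, which encodes these properties as constraints over decision variables; the function names `shortest_paths` and `reachability` and the four-node toy cell are assumptions made for the example.

```python
from collections import deque


def shortest_paths(adj):
    """All-pairs shortest path lengths (counted in edges) via BFS from each node.

    adj[u][v] == 1 means there is a directed edge u -> v.
    Pairs with no directed path get None.
    """
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                # First visit in BFS order gives the shortest distance.
                if adj[u][v] and dist[s][v] is None:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist


def reachability(dist):
    """reach[u][v] is True exactly when a directed path u -> v exists."""
    return [[d is not None for d in row] for row in dist]


# Hypothetical toy cell: node 0 is the input, node 3 the output.
# Edges: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

dist = shortest_paths(adj)
reach = reachability(dist)
```

In this toy cell the output is reachable from the input, but not the other way around, so the graph is a DAG and is not strongly connected.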

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

++ This method extends graph BO to NAS by relaxing the strong connectivity assumption of BoGrape.
++ Comprehensive experiments on three major NAS benchmarks under both deterministic and noisy settings demonstrate robustness and efficiency.

Weaknesses

-- The MIP encoding for graph structures builds heavily on BoGrape, with the main adaptation being the relaxation of strong connectivity. While this is non-trivial, the paper could better highlight what specific constraints were modified or added to handle NAS-specific DAGs. Specifically, the claim that BoGrape is unsuitable due to strong connectivity is not followed by a clear explanation of how this is resolved beyond "generalizing the graph encoding."
-- I am afraid that this method is not

Reviewer 02 · Rating 4 · Confidence 3

Strengths

1. The paper is clearly written and easy to follow.
2. The authors design a full condition plan of NAS graph space.
3. The code is supplied, and the hyper-parameters are reported.

Weaknesses

1. The complexity of the method should be analyzed.
2. The main content of Theorem 1 is more a modeling plan of the graph space than a theorem, yet it takes up too much space in the paper, which hurts readability. In addition, Theorem 1 does not need to be stated as a theorem.
3. The experiments are all conducted on NB101~301; it would be better to evaluate the method on more datasets. Besides, the method does not achieve SOTA in some cases.

Reviewer 03 · Rating 8 · Confidence 2

Strengths

1. The paper proposes an equivalent representation of general labeled graphs in the optimization variable space, ensuring that each graph corresponds to a unique feasible solution. Moreover, it introduces a unified kernel formulation that quantifies the similarity between two labeled graphs at the levels of graph structure, node labels, and edge labels. The advantages over baselines were demonstrated in NAS Bench 101, NAS Bench 201, and NAS Bench 301.
2. The formulas and derivation proofs in th

Weaknesses

1. The benchmarks used (NAS Bench 101, NAS Bench 201, and NAS Bench 301) are all from before 2022. Similarly, the baseline methods such as GCN, NAS BOT, and NAS BOWL are also from before 2021. No experiments were conducted on the latest benchmarks or with more recent baseline methods.
2. This paper lacks an analysis of the algorithm's time complexity.
3. The evaluated benchmark is limited to NAS, lacking experiments on real-world tasks, which makes the contribution relatively limited.

Taxonomy

Topics: Advanced Graph Neural Networks · Advanced Neural Network Applications · Machine Learning and Data Classification
