Causal-aware Graph Neural Architecture Search under Distribution Shifts
Peiwen Li, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Jialong Wang, Yang Li, Wenwu Zhu

TL;DR
This paper introduces CARNAS, a causal-aware neural architecture search method for graphs that improves generalization under distribution shifts by discovering and leveraging stable causal relationships between graphs and architectures.
Contribution
The paper proposes a novel causal graph-architecture search framework that captures stable causal subgraphs and intervenes on them to enhance out-of-distribution generalization.
Findings
CARNAS outperforms existing methods in out-of-distribution scenarios.
Disentangled Causal Subgraph Identification captures stable causal features.
Graph Embedding Intervention improves model robustness.
Abstract
Graph NAS has emerged as a promising approach for autonomously designing GNN architectures by leveraging the correlations between graphs and architectures. Existing methods fail to generalize under distribution shifts that are ubiquitous in real-world graph scenarios, mainly because the graph-architecture correlations they exploit might be spurious and vary across distributions. We propose to handle distribution shifts in the graph architecture search process by discovering and exploiting the causal relationship between graphs and architectures, so as to search for optimal architectures that can generalize under distribution shifts. The problem remains unexplored, with the following challenges: how to discover the causal graph-architecture relationship that has stable predictive abilities across distributions, and how to handle distribution shifts with the discovered causal…
Peer Reviews
Decision · Submitted to ICLR 2025
1. Well-structured modular approach: The CARNAS framework is thoughtfully organized, with each component clearly contributing to improved generalization under distribution shifts.
2. Robust experimentation: The paper includes extensive experiments across synthetic and real-world datasets, highlighting the robustness of the proposed method.
3. Component-level contribution clarity: Each module’s individual contribution is demonstrated, providing transparency and supporting the effectiveness of t

1. Clarity in Section 3.3: Given my limited familiarity with Graph NAS, the dynamic graph neural network architecture production and optimization process described in Section 3.3 remains somewhat unclear to me. A visual representation and a more detailed explanation would significantly improve the paper's readability.
2. Causal-Aware Solution's Justification: While the paper presents a causal-aware solution for handling distribution shifts, some aspects require stronger theoretical sup

1. The paper is the first to study Graph NAS under distribution shifts using causality. The problem is well-motivated with clear real-world relevance and applications.
2. The authors present comprehensive experiments on both synthetic and real-world datasets that demonstrate clear performance improvements over existing baselines. The thorough ablation studies effectively validate each component of the proposed method, and the analysis provides valuable insights into the model's behavior.
3. The

1. While the paper shows good performance on the tested datasets, it lacks a detailed analysis of computational complexity and memory requirements. Specifically, the time complexity of $O(|E|(d_0 + d_1 + |O|d_s) + |V|(d_0^2 + d_1^2 + |O|d_s^2) + |O|^2d_1)$ could become prohibitive for very large graphs. The authors should discuss how their method performs on graphs with millions of nodes and edges, which are common in real-world applications like social networks.
2. The method requires careful t

1. This paper innovatively proposes using NAS to address the problem of causal information identification in graph data.
2. The paper conducts extensive experiments to validate the proposed method.

1. I believe the paper does not clearly explain why NAS can help adjust GNNs to identify causal information, which I consider the main issue of the paper. In my view, NAS optimizes the structure of GNNs, enhancing their efficiency or expressiveness, but it does not inherently enable GNNs to determine what type of data to model. At the very least, the authors did not provide a clear explanation of this point in the paper.
2. The paper lacks theoretical justification for the regulatory capabilit
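
To make the scalability concern raised in the reviews more concrete, the quoted time-complexity expression can be evaluated term by term. The sketch below is only illustrative: this excerpt does not define the symbols, so it assumes that |E| and |V| are the edge and node counts, |O| is the number of candidate operations, and d_0, d_1, d_s are hidden dimensions. The function name `complexity_terms` and all numeric sizes are hypothetical and not taken from the paper.

```python
def complexity_terms(num_edges, num_nodes, num_ops, d0, d1, ds):
    """Evaluate each term of O(|E|(d0 + d1 + |O|ds) + |V|(d0^2 + d1^2 + |O|ds^2) + |O|^2 d1).

    Constant factors hidden by big-O are ignored, so the values are only
    meaningful relative to each other, not as operation counts.
    Symbol meanings are assumptions (see the lead-in text).
    """
    edge_term = num_edges * (d0 + d1 + num_ops * ds)
    node_term = num_nodes * (d0**2 + d1**2 + num_ops * ds**2)
    op_term = num_ops**2 * d1
    return {
        "edge": edge_term,
        "node": node_term,
        "op": op_term,
        "total": edge_term + node_term + op_term,
    }


# Hypothetical sizes, chosen only for illustration (not from the paper).
small = complexity_terms(num_edges=10_000, num_nodes=1_000,
                         num_ops=6, d0=64, d1=64, ds=64)
large = complexity_terms(num_edges=10_000_000, num_nodes=1_000_000,
                         num_ops=6, d0=64, d1=64, ds=64)

# Ratio of the two totals: how much the bound grows with graph size.
print(large["total"] / small["total"])
```

With the dimensions and |O| held fixed, the expression is linear in |E| and |V|, so under these assumptions the bound grows in proportion to graph size. Whether that is prohibitive in practice depends on constants and memory behavior that the bound does not capture.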
Taxonomy
Topics: Advanced Graph Neural Networks · Bayesian Modeling and Causal Inference · Neural Networks and Applications
