NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
Qitian Wu, Wentao Zhao, Zenan Li, David Wipf, Junchi Yan

TL;DR
NodeFormer introduces a scalable Transformer-based approach for node classification that learns adaptive graph structures efficiently, handling large graphs and incomplete data with linear complexity and strong empirical results.
Contribution
The paper proposes a novel all-pair message passing scheme with a kernelized Gumbel-Softmax operator, enabling scalable, differentiable learning of latent graph structures in large, potentially fully-connected graphs.
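To illustrate why a kernelized operator can make all-pair message passing scale linearly in the number of nodes, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes positive random features as the kernel approximation and leaves out the Gumbel noise. The function names and the `q`, `k`, `v`, `w` parameters are hypothetical. Because the key-value summary is computed once and shared by every query node, the n-by-n attention matrix is never formed.

```python
import numpy as np

def positive_random_features(x, w):
    # x: (n, d) node embeddings, w: (m, d) Gaussian random projections.
    # phi(x) = exp(w.x - |x|^2 / 2) / sqrt(m), so that in expectation
    # phi(q) . phi(k) equals exp(q . k), the unnormalized softmax kernel.
    proj = x @ w.T
    half_sq_norm = 0.5 * np.sum(x ** 2, axis=1, keepdims=True)
    return np.exp(proj - half_sq_norm) / np.sqrt(w.shape[0])

def linear_all_pair_aggregate(q, k, v, w):
    # Approximate softmax-weighted aggregation over all node pairs in
    # O(n * m * d) time instead of O(n^2 * d).
    phi_q = positive_random_features(q, w)   # (n, m)
    phi_k = positive_random_features(k, w)   # (n, m)
    kv_summary = phi_k.T @ v                 # (m, d_v), shared by all queries
    k_summary = phi_k.sum(axis=0)            # (m,)
    numerator = phi_q @ kv_summary           # (n, d_v)
    denominator = phi_q @ k_summary          # (n,)
    return numerator / denominator[:, None]

def exact_softmax_aggregate(q, k, v):
    # Quadratic-cost reference: materializes the full n-by-n weight matrix.
    scores = q @ k.T
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v
```

With enough random features and moderately scaled embeddings, the linear-time result should track the exact quadratic one closely. The approximation quality depends on the number of features `m` and on the embedding norms.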
Findings
Effective on graphs with up to 2 million nodes
Outperforms existing methods in node classification accuracy
Handles missing or incomplete graph data
Abstract
Graph neural networks have been extensively studied for learning with inter-connected data. Despite this, recent evidence has revealed GNNs' deficiencies related to over-squashing, heterophily, handling long-range dependencies, edge incompleteness and particularly, the absence of graphs altogether. While a plausible solution is to learn new adaptive topology for message passing, issues concerning quadratic complexity hinder simultaneous guarantees for scalability and precision in large networks. In this paper, we introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes, as an important building block for a pioneering Transformer-style network for node classification on large graphs, dubbed NodeFormer. Specifically, the efficient computation is enabled by a kernelized Gumbel-Softmax operator that reduces the…
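As background for the Gumbel-Softmax operator named in the abstract, the following is a minimal NumPy sketch of the plain (non-kernelized) Gumbel-Softmax relaxation, which turns discrete sampling into a differentiable, temperature-controlled soft selection. It is an illustration under stated assumptions, not the paper's operator: the function name, the `tau` temperature parameter, and the explicit `logits` matrix are hypothetical, and applying it to an explicit all-pair logit matrix would still cost quadratic time.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    # logits: (..., c) unnormalized scores, e.g. one row of candidate
    # neighbors per node. tau > 0 is the temperature.
    # Gumbel(0, 1) noise via the inverse-CDF trick: g = -log(-log(u)).
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    # Perturb, scale by temperature, then apply a numerically stable softmax.
    y = (logits + gumbel_noise) / tau
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)
```

Each output row is a probability vector. For a fixed noise draw, lowering `tau` concentrates more mass on the largest perturbed logit, which brings the soft selection closer to a discrete choice of neighbor.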
Taxonomy
Topics: Advanced Graph Neural Networks · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices
