ReHub: Linear Complexity Graph Transformers with Adaptive Hub-Spoke Reassignment
Tomer Borreda, Daniel Freedman, Or Litany

TL;DR
ReHub introduces a linear complexity graph transformer that adaptively reassigns nodes to virtual hubs, enabling efficient long-range communication and outperforming existing methods on multiple benchmarks.
Contribution
It proposes an adaptive reassignment technique for virtual nodes in graph transformers, achieving linear complexity without sacrificing performance.
Findings
ReHub outperforms Neural Atoms and other baselines on multiple benchmarks.
The method maintains linear complexity while achieving performance comparable to non-sparse models.
Adaptive reassignment improves efficiency and effectiveness in large-scale graph learning.
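To make the linear-complexity claim concrete, a rough cost count may help. The symbols are illustrative assumptions and are not given in this summary: N is the number of graph nodes (spokes), N_h the number of hubs, k the number of hubs each spoke attends to, and d the feature width. Whether the paper uses exactly these choices of k and N_h is not stated here.

```latex
\begin{align*}
\text{dense attention over all nodes:} \quad & O(N^2 d) \\
\text{spoke-hub attention, } k \text{ hubs per spoke:} \quad & O(N k d) \\
\text{hub-hub attention:} \quad & O(N_h^2 d) \\
\text{total, assuming } k = O(1),\ N_h = O(\sqrt{N}): \quad & O(N k d + N_h^2 d) = O(N d)
\end{align*}
```

Under these assumptions the total grows linearly in N, because no step compares every node with every other node, and the dense hub-hub term stays at most proportional to N.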
Abstract
We present ReHub, a novel graph transformer architecture that achieves linear complexity through an efficient reassignment technique between nodes and virtual nodes. Graph transformers have become increasingly important in graph learning for their ability to utilize long-range node communication explicitly, addressing limitations such as oversmoothing and oversquashing found in message-passing graph networks. However, their dense attention mechanism scales quadratically with the number of nodes, limiting their applicability to large-scale graphs. ReHub draws inspiration from the airline industry's hub-and-spoke model, where flights are assigned to optimize operational efficiency. In our approach, graph nodes (spokes) are dynamically reassigned to a fixed number of virtual nodes (hubs) at each model layer. Recent work, Neural Atoms (Li et al., 2024), has demonstrated impressive and…
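The hub-and-spoke idea described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the mean aggregation used to update hubs, and the reassignment rule (move each spoke to the k hubs nearest to its most-attended hub) are all hypothetical choices made for the example, as are the sizes N, H, d, and k.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hubs_from_spokes(spokes, hubs, assign):
    """Each hub adds the mean of its assigned spokes: O(N * k * d).
    Mean aggregation is a stand-in (assumption) for learned attention."""
    d = len(hubs[0])
    sums = [[0.0] * d for _ in hubs]
    counts = [0] * len(hubs)
    for x, hub_ids in zip(spokes, assign):
        for h in hub_ids:
            counts[h] += 1
            for j in range(d):
                sums[h][j] += x[j]
    return [
        [hj + sj / c for hj, sj in zip(hub, s)] if c else list(hub)
        for hub, s, c in zip(hubs, sums, counts)
    ]

def spokes_from_hubs(spokes, hubs, assign):
    """Each spoke attends only to its k assigned hubs: O(N * k * d)."""
    d = len(hubs[0])
    updated, weights = [], []
    for x, hub_ids in zip(spokes, assign):
        w = softmax([dot(x, hubs[h]) / math.sqrt(d) for h in hub_ids])
        mixed = [sum(wi * hubs[h][j] for wi, h in zip(w, hub_ids)) for j in range(d)]
        updated.append([xj + mj for xj, mj in zip(x, mixed)])  # residual update
        weights.append(w)
    return updated, weights

def reassign(hubs, assign, weights, k):
    """Hypothetical rule: each spoke moves to the k hubs nearest to the hub
    it currently attends to most. Only hub-hub distances are computed
    (about H^2 * d work plus sorting), so no spoke is compared to all hubs."""
    n_hubs = len(hubs)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = [
        sorted(range(n_hubs), key=lambda g, h=h: sq_dist(hubs[h], hubs[g]))[:k]
        for h in range(n_hubs)
    ]
    new_assign = []
    for hub_ids, w in zip(assign, weights):
        best = hub_ids[w.index(max(w))]  # most-attended current hub
        new_assign.append(nearest[best])
    return new_assign

random.seed(0)
N, H, d, k = 200, 14, 8, 3  # illustrative sizes; H is roughly sqrt(N)
spokes = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
hubs = [[random.gauss(0, 1) for _ in range(d)] for _ in range(H)]
assign = [random.sample(range(H), k) for _ in range(N)]

for _ in range(2):  # two "layers", each followed by reassignment
    hubs = hubs_from_spokes(spokes, hubs, assign)
    spokes, weights = spokes_from_hubs(spokes, hubs, assign)
    assign = reassign(hubs, assign, weights, k)
```

The point of the design is that every per-spoke step touches only k hubs, and the only all-pairs computation is among the much smaller set of hubs. A real model would use learned projections and multi-head attention in place of the raw dot products and means used here.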
Peer Reviews
Decision·Submitted to ICLR 2025
- The proposed method is simple.
- Experimental results demonstrate that ReHub consistently achieves higher accuracy than existing SOTA methods.
Major:
- The paper highlights the linear complexity of the proposed algorithm, but an experimental efficiency comparison is missing.
- A convergence comparison is missing. ReHub achieves higher accuracy, but it is not clear whether it needs more iterations to converge.
- The datasets used in the experiments are too small. The largest graph contains only 169K nodes.
Minor:
- It would be better if the authors visualized the hub assignment to show whether the proposed method generates meaningful patterns.
- The paper is clearly written and easy to follow.
- The proposed architecture is well-motivated.
- Spokes and Hub is a plausible algorithm for graph learning.
While the proposed method is intriguing, the experimental evaluation lacks comprehensiveness. Several key recent baselines are missing, which undermines the validity of the results. The authors aim to reduce the complexity of self-attention-based graph transformers, yet they only test on a single large dataset, ogbn-arxiv. On this dataset, the proposed model does not outperform the vanilla GraphSAGE model, which is not even included in the baseline comparisons. In Table 1, crucial b
1. The method applies to a wide range of graph tasks with only a minor modification to the prediction head of the architecture.
2. The evaluation section covers various datasets and baseline methods, providing comprehensive results on the method's improved ability to capture long-range information compared with existing methods.
3. The writing and presentation are clear and easy to follow.
1. The motivation is only intuitive: the spoke-to-hub transportation analogy alone does not sufficiently motivate this structure. The hub reassignment motivation and strategy are likewise explained intuitively, with little further support.
2. In the complexity evaluation, only (peak) memory consumption is compared with other models to show empirically that memory complexity is low, while the claim is that both time and memory complexity are constrained. The empirical results of time complexity remai
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · DNA and Biological Computing
Methods: Attention Is All You Need · Adam · Position-Wise Feed-Forward Layer · Linear Layer · Softmax · Multi-Head Attention · Byte Pair Encoding · Label Smoothing · Dropout · Laplacian EigenMap
