ReHub: Linear Complexity Graph Transformers with Adaptive Hub-Spoke Reassignment

Tomer Borreda; Daniel Freedman; Or Litany

arXiv:2412.01519·cs.LG·August 26, 2025

Open Access · 3 Reviews

TL;DR

ReHub introduces a linear complexity graph transformer that adaptively reassigns nodes to virtual hubs, enabling efficient long-range communication and outperforming existing methods on multiple benchmarks.

Contribution

The paper proposes an adaptive technique for reassigning graph nodes to virtual nodes in graph transformers, achieving linear complexity without sacrificing performance.

Findings

01

ReHub outperforms Neural Atoms and other baselines on multiple benchmarks.

02

The method maintains linear complexity while achieving performance comparable to non-sparse models.

03

Adaptive reassignment improves efficiency and effectiveness in large-scale graph learning.

Abstract

We present ReHub, a novel graph transformer architecture that achieves linear complexity through an efficient reassignment technique between nodes and virtual nodes. Graph transformers have become increasingly important in graph learning for their ability to utilize long-range node communication explicitly, addressing limitations such as oversmoothing and oversquashing found in message-passing graph networks. However, their dense attention mechanism scales quadratically with the number of nodes, limiting their applicability to large-scale graphs. ReHub draws inspiration from the airline industry's hub-and-spoke model, where flights are assigned to optimize operational efficiency. In our approach, graph nodes (spokes) are dynamically reassigned to a fixed number of virtual nodes (hubs) at each model layer. Recent work, Neural Atoms (Li et al., 2024), has demonstrated impressive and…
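The hub-and-spoke idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the top-k dot-product assignment rule, the mean aggregation, the residual updates, and the names `assign_spokes_to_hubs`, `hub_spoke_layer`, and `attention_pairs` are all assumptions made for illustration. The sketch only shows why routing N spokes through a fixed number H of hubs, with each spoke linked to k hubs, avoids the N × N attention matrix.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # stabilise before exponentiating
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def assign_spokes_to_hubs(spokes, hubs, k):
    """Assumed rule: each spoke picks its k highest-scoring hubs."""
    scores = spokes @ hubs.T                               # (N, H), linear in N for fixed H
    return np.argpartition(-scores, k - 1, axis=1)[:, :k]  # (N, k)

def hub_spoke_layer(spokes, hubs, k):
    """One illustrative layer: spokes -> hubs, hubs <-> hubs, hubs -> spokes."""
    d = spokes.shape[1]
    assign = assign_spokes_to_hubs(spokes, hubs, k)        # recomputed every layer

    # 1) spokes -> hubs: each hub averages the spokes currently assigned to it
    hub_sum = np.zeros_like(hubs)
    hub_cnt = np.zeros(len(hubs))
    np.add.at(hub_sum, assign.ravel(), np.repeat(spokes, k, axis=0))
    np.add.at(hub_cnt, assign.ravel(), 1.0)
    hubs = hubs + hub_sum / np.maximum(hub_cnt, 1.0)[:, None]

    # 2) hubs <-> hubs: dense attention, but only H x H
    hubs = hubs + softmax(hubs @ hubs.T / np.sqrt(d)) @ hubs

    # 3) hubs -> spokes: each spoke attends only to its own k hubs
    picked = hubs[assign]                                  # (N, k, d)
    w = softmax(np.einsum("nd,nkd->nk", spokes, picked) / np.sqrt(d))
    spokes = spokes + np.einsum("nk,nkd->nd", w, picked)
    return spokes, hubs, assign

def attention_pairs(n, h, k):
    """Count pairwise interactions: dense attention vs. this sketch."""
    return {"dense": n * n, "hub_spoke": n * h + n * k + h * h}

# Hypothetical sizes, chosen only for the demo.
rng = np.random.default_rng(0)
N, H, D, K = 1000, 8, 16, 2
spokes = rng.normal(size=(N, D))
hubs = rng.normal(size=(H, D))
for _ in range(3):                                         # assignment changes per layer
    spokes, hubs, assign = hub_spoke_layer(spokes, hubs, K)
print(attention_pairs(169_000, H, K))
```

With H and k held fixed, every step above touches each spoke a constant number of times, so the cost grows linearly in N, whereas the dense count grows as N². The hub count and k used here are placeholders, not values reported in the paper.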

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 3

Strengths

- The proposed method is simple.
- Experimental results demonstrate that ReHub consistently achieves higher accuracy than existing SOTA methods.

Weaknesses

Major:
- This paper highlights the linear complexity of the proposed algorithm, but an experimental efficiency comparison is missing.
- A convergence comparison is missing. ReHub achieves higher accuracy, but it is not clear whether ReHub needs more iterations to converge.
- The datasets used in the experiments are too small. The largest graph contains only 169K nodes.

Minor:
- It would be better if the authors visualized the hub assignment to see whether the proposed method generates meaningful patterns.

Reviewer 02 · Rating 3 · Confidence 5

Strengths

- The paper is clearly written and easy to follow.
- The proposed architecture is well-motivated.
- Spokes and Hub is a plausible algorithm for graph learning.

Weaknesses

While the proposed method is intriguing, the experimental evaluation lacks comprehensiveness. Several key recent baselines are missing, which undermines the validity of the results. The authors aim to reduce the complexity of self-attention-based graph transformers, yet they only test on a single large dataset, ogbn-arxiv. On this dataset, the proposed model does not outperform the vanilla GraphSAGE model, which is not even included in the baseline comparisons. In Table 1, crucial b

Reviewer 03 · Rating 5 · Confidence 4

Strengths

1. The method demonstrates wide applicability to various graph tasks with a minor modification of the prediction head in the architecture.
2. The evaluation section covers various datasets and baseline methods, providing comprehensive results on the enhanced ability to capture long-range information compared with existing methods.
3. The writing and presentation are clear and easy to follow.

Weaknesses

1. The motivation is intuitive: from the spoke-to-hub transportation model, the motivation of such structure is somewhat insufficient. The Hub Reassignment motivation and strategy are intuitively explained, with little further support.
2. In the evaluation of complexity, only (peak) memory consumption is compared with other models to empirically show the low memory complexity, while the claim is that both time and memory complexity are constrained. The empirical results of time complexity remai

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · DNA and Biological Computing

Methods: Attention Is All You Need · Adam · Position-Wise Feed-Forward Layer · Linear Layer · Softmax · Multi-Head Attention · Byte Pair Encoding · Label Smoothing · Dropout · Laplacian EigenMap