Even Sparser Graph Transformers
Hamed Shirzad, Honghao Lin, Balaji Venkatachalam, Ameya Velingker, David Woodruff, Danica Sutherland

TL;DR
This paper introduces Spexphormer, a two-stage graph transformer training method that reduces memory usage by sparsifying the graph after initial training, supported by theoretical analysis and empirical results.
Contribution
It proposes a novel two-stage training approach for graph transformers that maintains performance while significantly reducing memory requirements.
Findings
Spexphormer achieves comparable accuracy with less memory.
Attention scores are consistent across network widths.
Theoretical conditions support the method's effectiveness.
Abstract
Graph Transformers excel in long-range dependency modeling, but generally require quadratic memory complexity in the number of nodes in an input graph, and hence have trouble scaling to large graphs. Sparse attention variants such as Exphormer can help, but may require high-degree augmentations to the input graph for good performance, and do not attempt to sparsify an already-dense input graph. As the learned attention mechanisms tend to use few of these edges, such high-degree connections may be unnecessary. We show (empirically and with theoretical backing) that attention scores on graphs are usually quite consistent across network widths, and use this observation to propose a two-stage procedure, which we call Spexphormer: first, train a narrow network on the full augmented graph. Next, use only the active connections to train a wider network on a much sparser graph. We establish…
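The second stage described in the abstract, keeping only the connections the narrow network actually uses, can be illustrated with a small sketch. This is not the authors' implementation: the summary does not state how "active" connections are chosen, so the top-k rule below, the function name sparsify_by_attention, and the toy edges and scores are all assumptions made for illustration.

```python
import numpy as np

def sparsify_by_attention(edges, scores, k):
    """Keep, for each target node, its k highest-scoring incoming edges.

    edges  : (E, 2) integer array of (source, target) pairs
    scores : (E,) attention score per edge (assumed to come from the
             narrow first-stage network; hypothetical input)
    k      : maximum number of incoming edges kept per target node
    """
    edges = np.asarray(edges)
    scores = np.asarray(scores, dtype=float)
    # Sort by target node, then by descending score within each target.
    order = np.lexsort((-scores, edges[:, 1]))
    sorted_targets = edges[order, 1]
    # Position where each target's group of edges starts in the sorted order.
    group_start = np.r_[0, np.flatnonzero(np.diff(sorted_targets)) + 1]
    group_sizes = np.diff(np.r_[group_start, len(order)])
    # Rank of every edge inside its group (0 = highest attention score).
    rank = np.arange(len(order)) - np.repeat(group_start, group_sizes)
    keep = np.sort(order[rank < k])  # restore the original edge order
    return edges[keep]

# Toy augmented graph: three edges into node 2, two edges into node 1.
edges = np.array([[0, 2], [1, 2], [3, 2], [0, 1], [2, 1]])
scores = np.array([0.7, 0.1, 0.2, 0.4, 0.6])
sparse_edges = sparsify_by_attention(edges, scores, k=1)
print(sparse_edges)
```

With k=1 the sketch keeps only the single strongest incoming edge of each node; the wider second-stage network would then be trained on sparse_edges instead of the full augmented edge list. A sampling-based rule could be substituted for the top-k selection without changing the overall two-stage structure.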
Taxonomy
Topics: Advanced Graph Neural Networks · Advanced Memory and Neural Computing · Graph Theory and Algorithms
Methods: Softmax · Attention Is All You Need
