HOT: Higher-Order Dynamic Graph Representation Learning with Efficient Transformers
Maciej Besta, Afonso Claudino Catarino, Lukas Gianinazzi, Nils Blach, Piotr Nyczyk, Hubert Niewiadomski, Torsten Hoefler

TL;DR
HOT introduces a higher-order graph representation learning model using efficient Transformers, improving dynamic link prediction accuracy by leveraging subgraph structures while balancing memory use.
Contribution
The paper presents HOT, a novel Transformer-based model that incorporates higher-order graph structures for improved dynamic link prediction accuracy with efficient memory management.
Findings
HOT achieves 9% higher accuracy than DyGFormer on the MOOC dataset.
HOT outperforms TGN and GraphMixer by 7% and 15% in accuracy, respectively.
The hierarchical attention scheme reduces memory footprint significantly.
Abstract
Many graph representation learning (GRL) problems are dynamic, with millions of edges added or removed per second. A fundamental workload in this setting is dynamic link prediction: using a history of graph updates to predict whether a given pair of vertices will become connected. Recent schemes for link prediction in such dynamic settings employ Transformers, modeling individual graph updates as single tokens. In this work, we propose HOT: a model that enhances this line of work by harnessing higher-order (HO) graph structures; specifically, k-hop neighbors and more general subgraphs containing a given pair of vertices. Harnessing such HO structures by encoding them into the attention matrix of the underlying Transformer results in higher accuracy of link prediction outcomes, but at the expense of increased memory pressure. To alleviate this, we resort to a recent class of schemes…
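To make the notion of k-hop neighbors concrete, here is a minimal Python sketch. It is not the HOT implementation: the function names, the (u, v, timestamp) edge format, and the choice to treat edges as undirected are all assumptions made for illustration. It only shows how a history of graph updates could be turned into the set of vertices within k hops of a given vertex, and how the neighborhoods of a candidate pair overlap.

```python
from collections import defaultdict


def k_hop_neighbors(edges, source, k, before):
    """Map each vertex within k hops of `source` to its hop distance.

    Hypothetical helper: `edges` is a list of (u, v, timestamp) tuples,
    treated as undirected. Only updates with timestamp < `before` are used,
    so the result reflects the history available at prediction time.
    """
    adjacency = defaultdict(set)
    for u, v, t in edges:
        if t < before:
            adjacency[u].add(v)
            adjacency[v].add(u)

    # Breadth-first search, one hop per iteration.
    distance = {source: 0}
    frontier = [source]
    for hop in range(1, k + 1):
        next_frontier = []
        for vertex in frontier:
            for neighbor in adjacency[vertex]:
                if neighbor not in distance:
                    distance[neighbor] = hop
                    next_frontier.append(neighbor)
        frontier = next_frontier

    del distance[source]  # the source is not its own neighbor
    return distance


def shared_neighbors(edges, u, v, k, before):
    """Vertices within k hops of both u and v (illustrative only)."""
    near_u = k_hop_neighbors(edges, u, k, before)
    near_v = k_hop_neighbors(edges, v, k, before)
    return sorted(set(near_u) & set(near_v))


if __name__ == "__main__":
    history = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("d", "e", 5)]
    print(k_hop_neighbors(history, "a", k=2, before=4))
    print(shared_neighbors(history, "a", "d", k=2, before=4))
```

With k = 1 the vertices a and d in this toy history share no neighbors, while k = 2 exposes b and c as common context. This is the kind of extra structure that higher-order schemes can draw on when predicting whether a and d will connect.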
Taxonomy
Topics: Advanced Graph Neural Networks · Data Quality and Management
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Dropout · Byte Pair Encoding · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer · Absolute Position Encodings
