Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning
Sungjun Cho, Seunghyuk Cho, Sungwoo Park, Hankook Lee, Honglak Lee, Moontae Lee

TL;DR
This paper introduces a non-Euclidean Transformer that operates on a product of constant-curvature spaces. This lets it represent hierarchical and cyclical graph structures more faithfully than a Euclidean model, with attention that scales linearly in time and memory.
Contribution
The paper proposes the Fully Product-Stereographic Transformer, a generalization of Transformers to non-Euclidean geometries that learns the curvature of each component space end-to-end, without extra tuning.
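To make "product of constant-curvature spaces" concrete, the following is a minimal sketch of distances in the standard κ-stereographic model, where κ < 0 gives hyperbolic, κ = 0 Euclidean, and κ > 0 spherical geometry. The formulas are the textbook ones; that the paper uses exactly these conventions is an assumption, and the function names (`tan_k`, `mobius_add`, `dist_k`, `product_dist`) are hypothetical. Here κ is a fixed float, whereas the model described above learns it.

```python
import numpy as np

def tan_k(x, k):
    """Curvature-dependent tangent: tan for k > 0, tanh for k < 0, identity for k = 0."""
    if k > 0:
        return np.tan(np.sqrt(k) * x) / np.sqrt(k)
    if k < 0:
        return np.tanh(np.sqrt(-k) * x) / np.sqrt(-k)
    return x

def arctan_k(x, k):
    """Inverse of tan_k."""
    if k > 0:
        return np.arctan(np.sqrt(k) * x) / np.sqrt(k)
    if k < 0:
        return np.arctanh(np.sqrt(-k) * x) / np.sqrt(-k)
    return x

def mobius_add(x, y, k):
    """Moebius addition in the k-stereographic model (reduces to x + y when k = 0)."""
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 - 2 * k * xy - k * y2) * x + (1 + k * x2) * y
    den = 1 - 2 * k * xy + k**2 * x2 * y2
    return num / den

def exp_map_origin(v, k):
    """Map a tangent vector at the origin onto the k-stereographic space."""
    norm = np.linalg.norm(v)
    return v if norm == 0 else tan_k(norm, k) * v / norm

def dist_k(x, y, k):
    """Geodesic distance; note the factor 2, so k = 0 gives twice the Euclidean distance."""
    return 2 * arctan_k(np.linalg.norm(mobius_add(-x, y, k)), k)

def product_dist(xs, ys, ks):
    """Distance on a product space: component distances combined in the l2 sense."""
    return np.sqrt(sum(dist_k(x, y, k) ** 2 for x, y, k in zip(xs, ys, ks)))

# One point per component: hyperbolic (k=-1), Euclidean (k=0), spherical (k=1).
origin = np.zeros(2)
xs = [origin, np.array([1.0, 2.0]), origin]
ys = [np.array([0.5, 0.0]), np.array([4.0, 6.0]), np.array([1.0, 0.0])]
ks = [-1.0, 0.0, 1.0]
total = product_dist(xs, ys, ks)
```

Mixing component signs in `ks` is what lets a single embedding carry both tree-like and cycle-like structure.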
Findings
Improved graph reconstruction accuracy
Enhanced node classification performance
Linear time and memory complexity for non-Euclidean attention
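The linear-complexity finding rests on a general idea that can be sketched in the Euclidean setting: if attention scores factor through a feature map, the key-value product can be computed once and reused for every query. The sketch below is a generic kernelized attention, not the paper's non-Euclidean mechanism; the feature map `elu(x) + 1` and all function names are assumptions chosen for illustration.

```python
import numpy as np

def feature_map(x):
    """Positive feature map elu(x) + 1 (an assumed choice, not taken from the paper)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def quadratic_attention(Q, K, V):
    """Reference form: builds the full n x n score matrix, O(n^2) time and memory."""
    scores = feature_map(Q) @ feature_map(K).T
    return (scores @ V) / scores.sum(axis=1, keepdims=True)

def linear_attention(Q, K, V):
    """Same output via associativity: phi(Q) (phi(K)^T V), O(n) in sequence length."""
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V          # (d, d_v) summary of all keys and values
    z = phi_k.sum(axis=0)     # (d,) normalizer summary
    return (phi_q @ kv) / (phi_q @ z)[:, None]

rng = np.random.default_rng(0)
n, d, d_v = 6, 4, 3
Q, K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d_v))
out_quadratic = quadratic_attention(Q, K, V)
out_linear = linear_attention(Q, K, V)
```

Both functions return the same matrix; only the order of the matrix products differs, which is why the second never materializes an n × n array.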
Abstract
Real-world graphs naturally exhibit hierarchical or cyclical structures that are unfit for the typical Euclidean space. While there exist graph neural networks that leverage hyperbolic or spherical spaces to learn representations that embed such structures more accurately, these methods are confined to the message-passing paradigm, making the models vulnerable to side effects such as oversmoothing and oversquashing. More recent works have proposed global attention-based graph Transformers that can easily model long-range interactions, but their extensions towards non-Euclidean geometry are yet unexplored. To bridge this gap, we propose Fully Product-Stereographic Transformer, a generalization of Transformers towards operating entirely on the product of constant curvature spaces. When combined with tokenized graph Transformers, our model can learn the curvature appropriate for the…
Taxonomy
Topics: Advanced Graph Neural Networks · Brain Tumor Detection and Classification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Absolute Position Encodings
