Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning

Sungjun Cho; Seunghyuk Cho; Sungwoo Park; Hankook Lee; Honglak Lee; Moontae Lee

arXiv:2309.04082·cs.LG·September 11, 2023

Open Access

TL;DR

This paper introduces a non-Euclidean Transformer that operates on a product of constant-curvature spaces, allowing it to represent hierarchical and cyclical graph structures more faithfully than a Euclidean model, and pairs it with an efficient attention mechanism.

Contribution

The paper proposes the Fully Product-Stereographic Transformer, a generalization of the Transformer to non-Euclidean geometries that learns the curvature suited to the input graph end-to-end, without extra tuning.

Findings

01 Improved graph reconstruction accuracy

02 Enhanced node classification performance

03 Linear time and memory complexity for non-Euclidean attention
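
This summary does not say how the linear cost in finding 03 is obtained. One common route to linear-cost attention is kernelization: the softmax is replaced by a positive feature map so the key-value product can be computed once and shared across all queries. The sketch below is plain Euclidean kernelized attention, not the paper's non-Euclidean mechanism. The feature map `elu(x) + 1` and all function names are assumptions chosen for illustration.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: strictly positive, so the normalizer below never vanishes.
    # Hypothetical choice of kernel feature map, not taken from the paper.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def quadratic_kernel_attention(Q, K, V):
    # Reference version: builds the full (n, n) attention matrix.
    A = feature_map(Q) @ feature_map(K).T
    return (A @ V) / A.sum(axis=1, keepdims=True)

def linear_kernel_attention(Q, K, V):
    # Same result, but reassociated so no (n, n) matrix is ever formed.
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V          # (d, d_v), computed once for all queries
    z = phi_k.sum(axis=0)     # (d,), shared normalizer statistics
    return (phi_q @ kv) / (phi_q @ z)[:, None]

rng = np.random.default_rng(0)
n, d, d_v = 6, 4, 3
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d_v))
out_quadratic = quadratic_kernel_attention(Q, K, V)
out_linear = linear_kernel_attention(Q, K, V)
```

Both functions return the same values. The second needs time and memory proportional to the number of tokens `n`, because `kv` and `z` do not depend on the query.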

Abstract

Real-world graphs naturally exhibit hierarchical or cyclical structures that are ill-suited to the typical Euclidean space. While there exist graph neural networks that leverage hyperbolic or spherical spaces to learn representations that embed such structures more accurately, these methods are confined to the message-passing paradigm, making the models vulnerable to side effects such as oversmoothing and oversquashing. More recent work has proposed global attention-based graph Transformers that can easily model long-range interactions, but their extensions to non-Euclidean geometry remain unexplored. To bridge this gap, we propose the Fully Product-Stereographic Transformer, a generalization of Transformers that operates entirely on the product of constant curvature spaces. When combined with tokenized graph Transformers, our model can learn the curvature appropriate for the…
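
To illustrate what a product of constant curvature spaces looks like numerically, the sketch below assumes the κ-stereographic model that the name "Product-Stereographic" suggests. In that model a single signed curvature κ covers hyperbolic (κ < 0), Euclidean (κ = 0), and spherical (κ > 0) geometry. The function names and example curvatures are hypothetical and not taken from the paper's code.

```python
import numpy as np

def mobius_add(x, y, kappa):
    # Moebius addition in the kappa-stereographic model.
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 - 2 * kappa * xy - kappa * y2) * x + (1 + kappa * x2) * y
    den = 1 - 2 * kappa * xy + kappa**2 * x2 * y2
    return num / den

def arctan_kappa(r, kappa):
    # Curvature-dependent inverse tangent: arctan, identity, or arctanh.
    if kappa > 0:
        s = np.sqrt(kappa)
        return np.arctan(s * r) / s
    if kappa < 0:
        s = np.sqrt(-kappa)
        return np.arctanh(s * r) / s
    return r

def stereographic_dist(x, y, kappa):
    # Geodesic distance within one constant-curvature component.
    return 2 * arctan_kappa(np.linalg.norm(mobius_add(-x, y, kappa)), kappa)

def product_dist(xs, ys, kappas):
    # Distance in the product space: l2 combination of component distances.
    return np.sqrt(sum(stereographic_dist(x, y, k) ** 2
                       for x, y, k in zip(xs, ys, kappas)))

# One hyperbolic, one flat, and one spherical component (illustrative values).
kappas = [-1.0, 0.0, 1.0]
xs = [np.array([0.1, -0.2]), np.array([0.3, 0.4]), np.array([0.2, 0.1])]
ys = [np.array([0.4, 0.1]), np.array([-0.1, 0.2]), np.array([-0.3, 0.5])]
d = product_dist(xs, ys, kappas)
```

With κ = 0 the component distance reduces to twice the Euclidean distance, because this model carries a conformal factor of 2 at zero curvature. In a learned setting, each entry of `kappas` would be a trainable parameter rather than a fixed constant.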

Taxonomy

Topics: Advanced Graph Neural Networks · Brain Tumor Detection and Classification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Absolute Position Encodings