Graph-Aware Transformer: Is Attention All Graphs Need?
Sanghyun Yoo, Young-Seok Kim, Kang Hyun Lee, Kuhwan Jeong, Junhwi Choi, Hoshik Lee, Young Sang Choi

TL;DR
This paper introduces GRAT, a novel Transformer-based model designed to encode and decode entire graphs, effectively handling their non-sequential nature and achieving state-of-the-art results in molecular property prediction and graph generation.
Contribution
GRAT is the first Transformer model capable of end-to-end graph encoding and decoding, with adaptive self-attention and a two-path auto-regressive decoding mechanism.
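To make "self-attention adaptive to edge information" concrete, here is a minimal NumPy sketch of one common way to do it: a scalar bias derived from each edge's features is added to the attention logit of the corresponding node pair. This is an illustrative assumption, not necessarily GRAT's formulation; the function name `edge_biased_attention`, the projection vector `w_e`, and all shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def edge_biased_attention(h, e, Wq, Wk, Wv, w_e):
    """Single-head self-attention over nodes with an additive edge bias.

    h:  (n, d)       node features
    e:  (n, n, d_e)  edge features for every ordered node pair
    Wq, Wk, Wv: (d, d_k) projection matrices
    w_e: (d_e,)      hypothetical projection of edge features to a scalar
    """
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    logits = (q @ k.T) / np.sqrt(q.shape[-1])  # standard scaled dot product
    logits = logits + e @ w_e                  # assumed edge bias, shape (n, n)
    attn = softmax(logits, axis=-1)
    return attn @ v, attn

# Toy graph with random features (illustrative values only).
rng = np.random.default_rng(0)
n, d, d_e, d_k = 4, 8, 3, 8
h = rng.normal(size=(n, d))
e = rng.normal(size=(n, n, d_e))
Wq, Wk, Wv = (rng.normal(size=(d, d_k)) for _ in range(3))
w_e = rng.normal(size=d_e)

out, attn = edge_biased_attention(h, e, Wq, Wk, Wv, w_e)
```

Because the sketch uses no positional encoding, reordering the nodes (and the edge tensor along with them) only reorders the output rows, which is the property one wants for non-sequential graph data.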
Findings
Achieved state-of-the-art performance on 4 QM9 regression tasks.
Demonstrated effectiveness in molecule graph generation.
Showed promising results across multiple graph-related tasks.
Abstract
Graphs are the natural data structure to represent relational and structural information in many domains. To cover the broad range of graph-data applications, including graph classification as well as graph generation, it is desirable to have a general and flexible model consisting of an encoder and a decoder that can handle graph data. Although the representative encoder-decoder model, the Transformer, shows superior performance in various tasks, especially in natural language processing, it is not immediately applicable to graphs due to their non-sequential characteristics. To tackle this incompatibility, we propose GRaph-Aware Transformer (GRAT), the first Transformer-based model which can encode and decode whole graphs in an end-to-end fashion. GRAT features a self-attention mechanism adaptive to the edge information and an auto-regressive decoding mechanism based on the two-path…
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Advanced Graph Neural Networks
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding
