Graph-Aware Transformer: Is Attention All Graphs Need?

Sanghyun Yoo; Young-Seok Kim; Kang Hyun Lee; Kuhwan Jeong; Junhwi Choi; Hoshik Lee; Young Sang Choi

arXiv:2006.05213 · cs.LG · June 11, 2020 · 6 cites

TL;DR

This paper introduces GRAT, a novel Transformer-based model designed to encode and decode entire graphs, effectively handling their non-sequential nature and achieving state-of-the-art results in molecular property prediction and graph generation.

Contribution

GRAT is the first Transformer model capable of end-to-end graph encoding and decoding, with adaptive self-attention and a two-path auto-regressive decoding mechanism.

Findings

01. Achieved state-of-the-art performance on 4 QM9 regression tasks.

02. Demonstrated effectiveness in molecule graph generation.

03. Showed promising results across multiple graph-related tasks.

Abstract

Graphs are the natural data structure to represent relational and structural information in many domains. To cover the broad range of graph-data applications including graph classification as well as graph generation, it is desirable to have a general and flexible model consisting of an encoder and a decoder that can handle graph data. Although the representative encoder-decoder model, Transformer, shows superior performance in various tasks especially of natural language processing, it is not immediately available for graphs due to their non-sequential characteristics. To tackle this incompatibility, we propose GRaph-Aware Transformer (GRAT), the first Transformer-based model which can encode and decode whole graphs in end-to-end fashion. GRAT is featured with a self-attention mechanism adaptive to the edge information and an auto-regressive decoding mechanism based on the two-path…
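The abstract mentions a self-attention mechanism adaptive to edge information but does not spell out its form here. As a rough illustration of the general idea only, the sketch below adds a per-edge scalar bias to scaled dot-product attention scores before the softmax. This additive-bias formulation, the function name `edge_biased_attention`, the `edge_bias` matrix, and the toy graph are all assumptions made for illustration, not GRAT's actual formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def edge_biased_attention(node_feats, edge_bias):
    """Single-head self-attention over graph nodes with an additive edge bias.

    node_feats: list of n feature vectors (each a list of d floats). They are
                used directly as queries, keys, and values; a real model would
                apply learned projections first.
    edge_bias:  n x n matrix. edge_bias[i][j] is a hypothetical scalar derived
                from the edge between nodes i and j (0.0 when there is no edge).
    Returns (outputs, weights).
    """
    n = len(node_feats)
    d = len(node_feats[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for i in range(n):
        # Scaled dot-product score plus the edge-dependent bias (assumed form).
        scores = [
            scale * sum(q * k for q, k in zip(node_feats[i], node_feats[j]))
            + edge_bias[i][j]
            for j in range(n)
        ]
        w = softmax(scores)
        out = [sum(w[j] * node_feats[j][c] for j in range(n)) for c in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy 3-node path graph 0 - 1 - 2; bonded pairs get a positive bias.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [
    [0.0, 2.0, 0.0],
    [2.0, 0.0, 2.0],
    [0.0, 2.0, 0.0],
]
outputs, weights = edge_biased_attention(feats, bias)
```

In this toy setup, node 0 places most of its attention on node 1, its only neighbor, even though their feature vectors are orthogonal: the edge bias, not feature similarity, drives that weight. Without the bias, node 1 would receive the smallest weight from node 0.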

Taxonomy

Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Advanced Graph Neural Networks

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding