Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei, Zimu Li, Chih-chan Tien, Risi Kondor, Aaron R. Dinner

TL;DR
This paper introduces SubFormer, a hierarchical graph transformer that efficiently processes chemical graphs by operating on subgraphs, reducing computational costs while maintaining competitive accuracy in molecular property prediction.
Contribution
SubFormer is a novel hierarchical transformer that reduces complexity by using subgraphs and message passing, improving efficiency over existing graph transformers.
Findings
SubFormer achieves competitive performance on chemical property benchmarks.
It significantly reduces training time to minutes on a consumer GPU.
The model exhibits limited over-smoothing and avoids over-squashing.
Abstract
Transformers, adapted from natural language processing, are emerging as a leading approach for graph representation learning. Contemporary graph transformers often treat nodes or edges as separate tokens. This approach leads to computational challenges for even moderately-sized graphs due to the quadratic scaling of self-attention complexity with token count. In this paper, we introduce SubFormer, a graph transformer that operates on subgraphs that aggregate information by a message-passing mechanism. This approach reduces the number of tokens and enhances learning long-range interactions. We demonstrate SubFormer on benchmarks for predicting molecular properties from chemical structures and show that it is competitive with state-of-the-art graph transformers at a fraction of the computational cost, with training times on the order of minutes on a consumer-grade graphics card. We…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMachine Learning in Materials Science · Advanced Graph Neural Networks · Computational Drug Discovery Methods
MethodsMulti-Head Attention · Attention Is All You Need · Dropout · Dense Connections · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Laplacian EigenMap
