Efficient Training of Transformers for Molecule Property Prediction on Small-scale Datasets
Shivesh Prakash

TL;DR
This paper introduces a GPS Transformer model with Self Attention tailored for small datasets. It achieves a state-of-the-art ROC-AUC of 78.8% on the BBBP dataset for predicting blood-brain barrier (BBB) permeability, a property that is crucial for drug development.
Contribution
The paper presents a novel GPS Transformer architecture with Self Attention optimized for low-data scenarios, outperforming existing models in BBB permeability prediction.
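As a rough illustration of the standard Self Attention the summary refers to, here is a minimal single-head scaled dot-product sketch in NumPy. It is not the paper's GPS layer or training code: the function name `self_attention`, the sizes `n_atoms` and `d_model`, and the random atom features and weight matrices are all placeholders assumed for the example.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the rows of x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Pairwise similarity between every query and every key, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of the value vectors.
    return weights @ v, weights

# Hypothetical toy input: 5 "atoms" with 8 features each (random, not real data).
rng = np.random.default_rng(0)
n_atoms, d_model = 5, 8
x = rng.normal(size=(n_atoms, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.sum(axis=-1))
```

Every atom attends to every other atom here, which is the sense in which this attention is "global"; each row of `weights` sums to 1.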
Findings
Achieved ROC-AUC of 78.8% on BBBP dataset
Surpassed previous models by 5.5% ROC-AUC
Standard Self Attention outperforms other attention variants
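For readers unfamiliar with the metric behind these numbers, ROC-AUC can be read as the probability that a randomly chosen positive example (here, a BBB-permeable molecule) is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name `roc_auc` and the toy labels and scores are made up for illustration and are not taken from the BBBP dataset or the paper.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Hypothetical toy predictions: 1 = permeable, 0 = not permeable.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")  # 7 of 9 pairs ranked correctly -> 0.778
```

A reported ROC-AUC of 78.8% corresponds to a value of 0.788 on this 0-to-1 scale. The pairwise loop is quadratic and is meant only to make the definition concrete.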
Abstract
The blood-brain barrier (BBB) serves as a protective barrier that separates the brain from the circulatory system, regulating the passage of substances into the central nervous system. Assessing the BBB permeability of potential drugs is crucial for effective drug targeting. However, traditional experimental methods for measuring BBB permeability are challenging and impractical for large-scale screening. Consequently, there is a need to develop computational approaches to predict BBB permeability. This paper proposes a GPS Transformer architecture augmented with Self Attention, designed to perform well in the low-data regime. The proposed approach achieves state-of-the-art performance on the BBB permeability prediction task using the BBBP dataset. With a ROC-AUC of 78.8%, it improves on the previous state of the art by 5.5%. We demonstrate that standard Self…
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods
Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Greedy Policy Search · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer
