Transformers for molecular property prediction: Lessons learned from the past five years
Afnan Sultan, Jochen Sieg, Miriam Mathea, and Andrea Volkamer

TL;DR
This paper reviews five years of research on transformer models for molecular property prediction (MPP), highlighting key insights, open challenges, and future directions.
Contribution
It provides a comprehensive analysis of transformer-based models for MPP, discussing training strategies, architecture choices, and evaluation challenges, and identifies gaps for future research.
Findings
Transformers have shown promising results in MPP tasks.
Standardized evaluation methods are needed for fair model comparison.
Key factors to weigh when training a transformer for MPP include the choice and scale of the pre-training data, the architecture, and the pre-training objectives.
Abstract
Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pre-training data, optimal architecture selections, and promising pre-training objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we…
Taxonomy
Topics: Computational Drug Discovery Methods · Various Chemistry Research Topics · Machine Learning in Materials Science
