Machine Learning Small Molecule Properties in Drug Discovery
Nikolai Schapin, Maciej Majewski, Alejandro Varela, Carlos Arroniz, Gianni De Fabritiis

TL;DR
This paper reviews machine learning methods for predicting small molecule properties in drug discovery, highlighting datasets, models, challenges, and the need for standardized benchmarks to improve model comparison and performance.
Contribution
It provides a comprehensive overview of recent ML approaches, datasets, and challenges in predicting small molecule properties, emphasizing the importance of data quality and benchmarking.
Findings
Multiple ML approaches have similar performance levels.
Neural networks do not always outperform simpler models.
High-quality data is crucial for accurate predictions.
Abstract
Machine learning (ML) is a promising approach for predicting small molecule properties in drug discovery. Here, we provide a comprehensive overview of various ML methods introduced for this purpose in recent years. We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss existing popular datasets as well as molecular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks. We also highlight the challenges of predicting and optimizing multiple properties during the hit-to-lead and lead optimization stages of drug discovery, and briefly explore possible multi-objective optimization techniques that can be used to balance diverse properties while optimizing lead candidates. Finally, techniques to provide an understanding of model predictions, especially for critical…
Taxonomy
Topics: Computational Drug Discovery Methods · Chemistry and Chemical Engineering · Machine Learning in Materials Science
