On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction
Nikolai Schapin, Carles Navarro, Albert Bou, Gianni De Fabritiis

TL;DR
This paper benchmarks machine learning models for protein-ligand binding affinity prediction, including 2D and 3D neural networks. It finds that simpler models can outperform complex ones in certain scenarios and that combining 2D and 3D models improves active learning performance.
Contribution
It provides a comprehensive comparison of classical and advanced ML models for binding affinity prediction, highlighting the benefits of pretraining and multi-modal approaches in drug discovery.
Findings
Simpler models can outperform complex models in specific tasks.
Pre-trained 3D models excel in data-scarce scenarios when structural information is available.
Combining 2D and 3D models improves active learning performance.
Abstract
Binding affinity optimization is crucial in early-stage drug discovery. While numerous machine learning methods exist for predicting ligand potency, their comparative efficacy remains unclear. This study evaluates the performance of classical tree-based models and advanced neural networks in protein-ligand binding affinity prediction. Our comprehensive benchmarking encompasses 2D models utilizing ligand-only RDKit embeddings and Large Language Model (LLM) ligand representations, as well as 3D neural networks incorporating bound protein-ligand conformations. We assess these models across multiple standard datasets, examining various predictive scenarios including classification, ranking, regression, and active learning. Results indicate that simpler models can surpass more complex ones in specific tasks, while 3D models leveraging structural information become increasingly competitive…
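The active learning scenario mentioned above can be illustrated with a small, self-contained sketch. This is not the authors' pipeline: the two one-dimensional features are hypothetical stand-ins for a 2D (ligand-only) and a 3D (structure-based) representation, the k-nearest-neighbour predictor replaces the real models, and the pool is synthetic toy data. It only shows the general shape of a greedy loop that averages the predictions of two models to choose which compounds to measure next.

```python
import random


def make_toy_pool(n=100, seed=0):
    """Build a synthetic pool. All values are made up for illustration."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n):
        affinity = rng.uniform(4.0, 10.0)  # toy affinity value, not real data
        pool.append({
            "feat_2d": affinity + rng.gauss(0, 1.0),  # stand-in for a 2D view
            "feat_3d": affinity + rng.gauss(0, 1.0),  # stand-in for a 3D view
            "affinity": affinity,
        })
    return pool


def knn_predict(train, query, k=3):
    """Average the labels of the k training points closest to `query`."""
    nearest = sorted(train, key=lambda fy: abs(fy[0] - query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def active_learning(pool, n_init=5, batch=5, rounds=4, seed=0):
    """Greedy loop: each round, 'measure' the compounds with the highest
    averaged prediction from the two stand-in models."""
    rng = random.Random(seed)
    indices = list(range(len(pool)))
    rng.shuffle(indices)
    labeled = indices[:n_init]          # random initial batch
    unlabeled = set(indices[n_init:])
    for _ in range(rounds):
        train_2d = [(pool[i]["feat_2d"], pool[i]["affinity"]) for i in labeled]
        train_3d = [(pool[i]["feat_3d"], pool[i]["affinity"]) for i in labeled]
        scores = {
            i: 0.5 * (knn_predict(train_2d, pool[i]["feat_2d"])
                      + knn_predict(train_3d, pool[i]["feat_3d"]))
            for i in unlabeled
        }
        # Sort indices first so ties are broken deterministically.
        picked = sorted(sorted(unlabeled), key=lambda i: scores[i],
                        reverse=True)[:batch]
        labeled.extend(picked)
        unlabeled.difference_update(picked)
    return labeled


pool = make_toy_pool()
selected = active_learning(pool)
print(len(selected), "compounds selected")
```

In a real setting, the stand-in predictors would be replaced by trained 2D and 3D models, and the acquisition rule might also account for model uncertainty rather than the predicted value alone.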
Taxonomy
Topics: Computational Drug Discovery Methods · Monoclonal and Polyclonal Antibodies Research · Genetics, Bioinformatics, and Biomedical Research
