Machine Learning Scoring Functions for Drug Discoveries from Experimental and Computer-Generated Protein-Ligand Structures: Towards Per-Target Scoring Functions
F. Pellicani, D. Dal Ben, A. Perali, S. Pilati

TL;DR
This study evaluates machine learning-based scoring functions for drug discovery, comparing experimental and computer-generated protein-ligand structures, and explores the development of per-target scoring functions with promising results.
Contribution
It demonstrates that neural networks perform similarly on experimental and computer-generated data and introduces per-target scoring functions tailored to individual proteins.
Findings
Neural networks show comparable performance on experimental and generated structures.
Performance drops when testing on unseen target proteins.
Per-target models yield encouraging results depending on protein type.
Abstract
In recent years, machine learning has been proposed as a promising strategy to build accurate scoring functions for computational docking aimed at numerically empowered drug discovery. However, recent studies have suggested that over-optimistic results have been reported due to correlations present in the experimental databases used for training and testing. Here, we investigate the performance of an artificial neural network in binding affinity predictions, comparing results obtained using both experimental protein-ligand structures and larger sets of computer-generated structures created using commercial software. Interestingly, similar performances are obtained on both databases. We find a noticeable performance drop when moving from random horizontal tests to vertical tests performed on target proteins not included in the training data. The possibility to…
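The distinction between horizontal and vertical tests can be illustrated with a small sketch. It assumes that a horizontal test is a random split over individual protein-ligand complexes, and that a vertical test holds out whole target proteins so that none of them appears in training, as the abstract describes. The function names, the tuple layout, and the toy data are hypothetical and are not taken from the paper.

```python
import random

def horizontal_split(complexes, test_fraction=0.2, seed=0):
    """Random split over complexes: a test protein may also appear in training."""
    rng = random.Random(seed)
    shuffled = list(complexes)
    rng.shuffle(shuffled)
    n_test = max(1, round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

def vertical_split(complexes, test_fraction=0.2, seed=0):
    """Split over proteins: every complex of a held-out protein goes to the test set."""
    rng = random.Random(seed)
    proteins = sorted({c[0] for c in complexes})
    rng.shuffle(proteins)
    n_test = max(1, round(test_fraction * len(proteins)))
    held_out = set(proteins[:n_test])
    train = [c for c in complexes if c[0] not in held_out]
    test = [c for c in complexes if c[0] in held_out]
    return train, test

# Toy data (assumption): (protein_id, ligand_id, binding_affinity), values made up.
complexes = [(f"P{p}", f"L{p}_{l}", 5.0 + 0.1 * l) for p in range(5) for l in range(4)]

h_train, h_test = horizontal_split(complexes)
v_train, v_test = vertical_split(complexes)

# In the vertical split, no protein is shared between training and test sets.
shared = {c[0] for c in v_train} & {c[0] for c in v_test}
print("proteins shared in vertical split:", len(shared))
```

In a horizontal split, ligands bound to the same protein can land on both sides, which is one way the correlations mentioned in the abstract could inflate test scores. The vertical split removes that overlap by construction.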
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
