ProMol_Func: A Structure-Free Deep Learning Model for Virtual Screening
Zixuan Feng, Max Kim, Aweon Richards, Tania J. Lupoli, Yingkai Zhang

TL;DR
ProMol_Func is a deep learning model that screens for drug candidates without requiring protein structures, and it generalizes to targets absent from its training data.
Contribution
ProMol_Func introduces a structure-free deep learning framework using molecule graphs and protein function embeddings for improved virtual screening.
Findings
ProMol_Func achieves an EF1% of 10.9 on the LIT-PCBA benchmark, showing strong screening performance.
The model successfully identified inhibitors for E. coli DnaK, a target not in the training data.
ProMol_Func improves generalization by using experimentally validated inactives and random decoys in training.
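The negative-augmentation idea in the last finding can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `build_training_pairs`, the `decoys_per_active` parameter, and the uniform sampling of decoys from a shared ligand pool are all hypothetical, and the summary does not say how decoys were actually chosen or in what ratio.

```python
import random

def build_training_pairs(actives, inactives, ligand_pool,
                         decoys_per_active=1, seed=0):
    """Assemble labeled (protein, ligand, label) examples.

    actives / inactives: lists of (protein_id, ligand_id) pairs with
    experimental support. ligand_pool: ligand ids to draw decoys from.
    Random decoys are only *presumed* non-binders (an assumption).
    """
    rng = random.Random(seed)
    known = set(actives) | set(inactives)
    examples = [(p, l, 1) for p, l in actives]      # measured binders
    examples += [(p, l, 0) for p, l in inactives]   # measured non-binders
    for protein, _ in actives:
        # Never reuse a pair that already has a label or was already drawn.
        candidates = [l for l in ligand_pool if (protein, l) not in known]
        k = min(decoys_per_active, len(candidates))
        for ligand in rng.sample(candidates, k):
            examples.append((protein, ligand, 0))   # random decoy
            known.add((protein, ligand))
    return examples

# Toy data with made-up identifiers.
actives = [("P1", "L1"), ("P2", "L2")]
inactives = [("P1", "L3")]
pool = ["L1", "L2", "L3", "L4", "L5"]
pairs = build_training_pairs(actives, inactives, pool, decoys_per_active=2)
```

Keeping measured inactives separate from sampled decoys makes it easy to vary the mix of the two negative sources, which is the knob the finding above is about.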
Abstract
In computer-aided drug discovery, structure-based drug design models are computationally intensive and rely on protein structures, limiting their scalability and generalization. Additionally, many existing models suffer from inflated false-positive rates due to the scarcity of negative binding data for training. To overcome these challenges, we present ProMol_Func, a structure-free deep learning framework that integrates graph-based encodings of small molecules with protein function embeddings derived solely from amino acid sequences. By augmenting the training data set with both experimentally validated inactives and randomly selected decoys, ProMol_Func improves screening power and generalization. The model achieves state-of-the-art performance on the challenging LIT-PCBA (Library of Integrated Targeted-Panel of Cell-Based Assays) benchmark, with an enrichment factor (EF1%) of…
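For readers unfamiliar with the metric, the enrichment factor at 1% compares the active rate among the top-scored 1% of a library with the active rate in the whole library. The sketch below uses the common textbook definition; the function name, the rounding of the cutoff with `ceil`, and the toy data are assumptions for illustration, and the paper's exact evaluation protocol may differ.

```python
import math

def enrichment_factor(scores, labels, fraction=0.01):
    """EF at the top `fraction` of a library ranked by descending score.

    labels are 1 for actives and 0 for inactives.
    EF = (actives in top slice / slice size) / (all actives / library size)
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    n_top = max(1, math.ceil(fraction * n_total))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])
    return (hits / n_top) / (n_actives / n_total)

# Toy library: 1000 compounds, 10 actives, 3 of them ranked in the top 10.
labels = [0] * 1000
for i in (0, 4, 9, 200, 300, 400, 500, 600, 700, 800):
    labels[i] = 1
scores = [1000 - i for i in range(1000)]  # index 0 is the best-scored compound
ef = enrichment_factor(scores, labels)
```

In this toy case the top 1% holds 3 of 10 slots as actives against a 1% base rate, so the EF is about 30. A random ranking gives an EF near 1, so the reported value of 10.9 means the top 1% was roughly eleven times richer in actives than chance.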
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Machine Learning in Bioinformatics
