Cross-Modality Protein Embedding for Compound-Protein Affinity and Contact Prediction
Yuning You, Yang Shen

TL;DR
This paper introduces a cross-modality protein embedding approach that combines amino-acid sequences and predicted residue contact maps to improve compound-protein affinity and contact prediction, aiding drug discovery.
Contribution
It proposes a novel cross-modality embedding model with interaction mechanisms that outperforms existing methods in predicting compound-protein interactions and contacts.
Findings
Cross-modality protein embedding improves CPAC prediction accuracy.
The cross-interaction model outperforms the single-modality models.
The model generalizes well to unseen proteins.
Abstract
Compound-protein pairs dominate FDA-approved drug-target pairs and the prediction of compound-protein affinity and contact (CPAC) could help accelerate drug discovery. In this study we consider proteins as multi-modal data including 1D amino-acid sequences and (sequence-predicted) 2D residue-pair contact maps. We empirically evaluate the embeddings of the two single modalities in their accuracy and generalizability of CPAC prediction (i.e. structure-free interpretable compound-protein affinity prediction). And we rationalize their performances in both challenges of embedding individual modalities and learning generalizable embedding-label relationship. We further propose two models involving cross-modality protein embedding and establish that the one with cross interaction (thus capturing correlations among modalities) outperforms SOTAs and our single modality models in affinity,…
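To make the idea of "cross interaction" between the two protein modalities concrete, the following is a minimal sketch in plain Python. It is not the authors' architecture. The function names, the single neighbor-averaging step over the contact map, the attention-style fusion, the random weights, and the toy protein are all assumptions chosen for illustration. The sketch embeds residues once from the 1D sequence and once from the 2D contact map, then lets each embedding attend over the other before concatenating them.

```python
import math
import random

def matmul(a, b):
    # Plain-Python matrix product of two list-of-lists matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    # Numerically stable softmax applied to each row.
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def tanh_all(m):
    return [[math.tanh(v) for v in row] for row in m]

def sequence_embedding(onehot, w_seq):
    # 1D modality (assumed form): per-residue linear projection of the sequence.
    return tanh_all(matmul(onehot, w_seq))

def graph_embedding(onehot, contact_map, w_graph):
    # 2D modality (assumed form): average features over contacting residues
    # (with self-loops), then project.
    n = len(contact_map)
    adj = [[contact_map[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    adj = [[v / sum(row) for v in row] for row in adj]
    return tanh_all(matmul(matmul(adj, onehot), w_graph))

def cross_interaction(h_seq, h_graph):
    # Hypothetical fusion: each modality attends over the other, the attended
    # context is added residually, and the two results are concatenated.
    scale = math.sqrt(len(h_seq[0]))
    scores = [[v / scale for v in row] for row in matmul(h_seq, transpose(h_graph))]
    seq_ctx = matmul(softmax_rows(scores), h_graph)
    graph_ctx = matmul(softmax_rows(transpose(scores)), h_seq)
    fused = []
    for s, sc, g, gc in zip(h_seq, seq_ctx, h_graph, graph_ctx):
        fused.append([a + b for a, b in zip(s, sc)] + [a + b for a, b in zip(g, gc)])
    return fused

# Toy protein: 5 residues, 20 amino-acid types, embedding size 4 (all made up).
random.seed(0)
n_res, n_aa, dim = 5, 20, 4
residues = [random.randrange(n_aa) for _ in range(n_res)]
onehot = [[1.0 if k == r else 0.0 for k in range(n_aa)] for r in residues]

# Symmetric contact map: sequence neighbors plus one long-range contact.
contact_map = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n_res)]
               for i in range(n_res)]
contact_map[0][4] = contact_map[4][0] = 1.0

w_seq = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_aa)]
w_graph = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_aa)]

h_seq = sequence_embedding(onehot, w_seq)
h_graph = graph_embedding(onehot, contact_map, w_graph)
fused = cross_interaction(h_seq, h_graph)
print(len(fused), len(fused[0]))  # 5 8
```

Simple concatenation of `h_seq` and `h_graph` would keep the two modalities independent until a downstream layer mixes them. The attention step here is one generic way to let correlations between the modalities shape the per-residue embedding, which is the property the abstract attributes to the cross-interaction model.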
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Machine Learning in Bioinformatics
