GeoPep: A geometry-aware masked language model for protein-peptide binding site prediction
Dian Chen, Yunkai Chen, Tong Lin, Sijie Chen, and Xiaolin Cheng

TL;DR
GeoPep is a novel geometry-aware model that fine-tunes a multimodal protein foundation model to improve peptide binding site prediction, overcoming data scarcity and peptide flexibility challenges.
Contribution
The paper introduces GeoPep, a transfer learning framework that leverages pre-trained protein models and 3D structural data for enhanced peptide binding site prediction.
Findings
GeoPep outperforms existing methods in peptide binding site prediction accuracy.
Effective use of 3D structural information improves predictions.
Transfer learning from protein-protein models benefits peptide binding prediction.
Abstract
Multimodal approaches that integrate protein structure and sequence have achieved remarkable success in protein-protein interface prediction. However, extending these methods to protein-peptide interactions remains challenging due to the inherent conformational flexibility of peptides and the limited availability of structural data that hinder direct training of structure-aware models. To address these limitations, we introduce GeoPep, a novel framework for peptide binding site prediction that leverages transfer learning from ESM3, a multimodal protein foundation model. GeoPep fine-tunes ESM3's rich pre-learned representations from protein-protein binding to address the limited availability of protein-peptide binding data. The fine-tuned model is further integrated with a parameter-efficient neural network architecture capable of learning complex patterns from sparse data. Furthermore,…
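The transfer-learning idea described in the abstract (reuse pre-learned per-residue representations, then train a small model on scarce binding labels) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `frozen_embed` is a hypothetical stand-in for a pre-trained encoder (it is not ESM3 and uses no structural input), the logistic-regression head is not GeoPep's architecture, and the sequence and labels are toy data.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def frozen_embed(sequence, dim=16, seed=0):
    """Hypothetical stand-in for a frozen pre-trained encoder.

    Returns one fixed vector per residue; the lookup table is never updated.
    """
    rng = np.random.default_rng(seed)
    table = rng.normal(size=(len(AMINO_ACIDS), dim))
    idx = [AMINO_ACIDS.index(aa) for aa in sequence]
    return table[idx]  # shape: (sequence length, dim)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y, eps=1e-9):
    """Mean binary cross-entropy between probabilities p and 0/1 labels y."""
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def train_head(X, y, lr=0.05, steps=500):
    """Fit a small per-residue logistic-regression head on frozen features."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad = p - y                      # d(BCE)/d(logit) for each residue
        w -= lr * (X.T @ grad) / len(y)   # only the head's parameters change
        b -= lr * grad.mean()
    return w, b

# Toy data (hypothetical): 1 marks a residue labeled as peptide-binding.
sequence = "MKTAYIAKQRQISFVKSHFSRQ"
labels = np.zeros(len(sequence))
labels[4:10] = 1.0

X = frozen_embed(sequence)
loss_before = bce(sigmoid(X @ np.zeros(X.shape[1])), labels)
w, b = train_head(X, labels)
probs = sigmoid(X @ w + b)               # per-residue binding probabilities
loss_after = bce(probs, labels)
```

Because the encoder stays fixed and only the head's few parameters are fitted, the number of trainable weights remains small relative to the amount of labeled data, which is the general motivation for this kind of setup when binding annotations are scarce.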
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · vaccines and immunoinformatics approaches
