TL;DR
PDeepPP is a versatile deep learning framework that leverages pretrained protein language models and hybrid architectures to accurately identify diverse peptide functions and post-translational modifications, advancing biomedical research.
Contribution
This work introduces PDeepPP, a unified deep learning model that improves generalizability and performance in peptide function and PTM identification across multiple biological tasks.
Findings
Achieves state-of-the-art performance in 25 of 33 tasks
High accuracy in antimicrobial peptide and phosphorylation site prediction
Significant reduction in false negatives for antimalarial peptides
Abstract
Accurate identification of bioactive peptides (BPs) and protein post-translational modifications (PTMs) is essential for understanding protein function and advancing therapeutic discovery. However, most computational methods remain limited in their generalizability across diverse peptide functions. Here, we present PDeepPP, a unified deep learning framework that integrates pretrained protein language models with a hybrid transformer-CNN architecture, enabling robust identification across diverse peptide classes and PTM sites. We curated comprehensive benchmark datasets and implemented strategies to address data imbalance, allowing PDeepPP to systematically extract both global and local sequence features. Through extensive analyses including dimensionality reduction and comparison studies, PDeepPP demonstrates strong, interpretable peptide representations and achieves state-of-the-art…
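The abstract mentions strategies to address data imbalance but does not say which ones PDeepPP uses. As an illustration only, the sketch below shows one common option, inverse-frequency class weighting, in which rarer classes receive larger loss weights. The function name `inverse_frequency_weights` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Weights are scaled so that a perfectly balanced dataset yields 1.0
    for every class. This is an assumed, generic technique, not the
    imbalance strategy described in the paper.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels: 1 = peptide with the function of interest, 0 = without.
labels = [1] * 10 + [0] * 90
weights = inverse_frequency_weights(labels)
```

With these toy labels the minority class (10 of 100 samples) gets a weight nine times larger than the majority class, so each class contributes equally to a weighted loss in aggregate. Such weights would typically be passed to a weighted loss function during training.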
