A general language model for peptide function identification

Jixiu Zhai; Zikun Wang; Chupei Tang; Haitian Zhong; Ziyang Xu; Yuhuan Liu; Shengrui Xu; Jingwan Wang; Dan Huang; Tianchi Lu

arXiv:2502.15610 · cs.LG · December 5, 2025

TL;DR

PDeepPP is a versatile deep learning framework that leverages pretrained protein language models and hybrid architectures to accurately identify diverse peptide functions and post-translational modifications, advancing biomedical research.

Contribution

This work introduces PDeepPP, a unified deep learning model that improves generalizability and performance in peptide function and PTM identification across multiple biological tasks.

Findings

1. Achieves state-of-the-art performance in 25 of 33 tasks
2. High accuracy in antimicrobial and phosphorylation site prediction
3. Significant reduction in false negatives for antimalarial peptides
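
To make the false-negative finding concrete, here is a minimal sketch of how a false-negative rate is typically computed for a binary peptide classifier. The function name `false_negative_rate` and the label lists are hypothetical and purely illustrative; they are not data or code from the paper.

```python
def false_negative_rate(y_true, y_pred):
    """Fraction of actual positives that the classifier labeled negative."""
    # Count positives that were missed (FN) and positives that were found (TP).
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = fn + tp
    if positives == 0:
        raise ValueError("no positive examples in y_true")
    return fn / positives


# Hypothetical labels for illustration only (1 = active peptide, 0 = inactive).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

fnr = false_negative_rate(y_true, y_pred)
recall = 1.0 - fnr  # recall (sensitivity) is the complement of the FN rate
```

A lower false-negative rate means fewer true active peptides are discarded before experimental follow-up, which is why this metric matters for screening tasks.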

Abstract

Accurate identification of bioactive peptides (BPs) and protein post-translational modifications (PTMs) is essential for understanding protein function and advancing therapeutic discovery. However, most computational methods remain limited in their generalizability across diverse peptide functions. Here, we present PDeepPP, a unified deep learning framework that integrates pretrained protein language models with a hybrid transformer-CNN architecture, enabling robust identification across diverse peptide classes and PTM sites. We curated comprehensive benchmark datasets and implemented strategies to address data imbalance, allowing PDeepPP to systematically extract both global and local sequence features. Through extensive analyses including dimensionality reduction and comparison studies, PDeepPP demonstrates strong, interpretable peptide representations and achieves state-of-the-art…
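
The abstract mentions strategies for data imbalance without naming them here. As a hedged illustration only, the sketch below shows one common generic approach, inverse-frequency class weighting, where rarer classes receive larger loss weights. This is an assumption for illustration and not necessarily the strategy PDeepPP uses; the function name `inverse_frequency_weights` and the label counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class, proportional to 1 / class frequency.

    Uses the common "balanced" heuristic: n_samples / (n_classes * class_count),
    so that each class contributes equally to a weighted loss in total.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical imbalanced dataset: 90 negative peptides, 10 positive peptides.
labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(labels)
```

With these hypothetical counts, the minority class receives a weight nine times larger than the majority class, so misclassified positives are penalized more heavily during training.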

Code & Models

Repositories

fondress/pdeeppp (PyTorch, official)

Models

fondress/PDeepPP (Hugging Face model)
