DeepProtein: Deep Learning Library and Benchmark for Protein Sequence Learning
Jiaqing Xie, Tianfan Fu

TL;DR
DeepProtein is a deep learning library and benchmark suite for protein sequence analysis. It lets researchers evaluate a range of models across multiple protein-related tasks, and its DeepProt-T5 models reach state-of-the-art results on four of those tasks.
Contribution
The paper introduces a library and benchmark for protein-related tasks, along with DeepProt-T5, a series of fine-tuned Prot-T5-based models that achieve state-of-the-art performance on four benchmark tasks.
Findings
DeepProt-T5 models achieve state-of-the-art results on four benchmark tasks and competitive results on six others.
The benchmark evaluates multiple deep learning architectures across diverse protein tasks.
DeepProtein is publicly available with extensive documentation.
Abstract
Deep learning has deeply influenced protein science, enabling breakthroughs in predicting protein properties, higher-order structures, and molecular interactions. This paper introduces DeepProtein, a comprehensive and user-friendly deep learning library tailored for protein-related tasks. It enables researchers to apply cutting-edge deep learning models to protein data with little friction. To assess model performance, we establish a benchmark evaluating different deep learning architectures across multiple protein-related tasks, including protein function prediction, subcellular localization prediction, protein-protein interaction prediction, and protein structure prediction. Furthermore, we introduce DeepProt-T5, a series of fine-tuned Prot-T5-based models that achieve state-of-the-art performance on four benchmark tasks while demonstrating competitive results on six others.…
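To make concrete what feeding protein data to a sequence model involves, the sketch below integer-encodes amino-acid sequences and pads them to a common length, a common preprocessing step before an embedding layer. It is a minimal illustration under stated assumptions, not DeepProtein's API: the names AMINO_ACIDS, PAD_ID, UNK_ID, encode, and encode_batch are hypothetical and were chosen only for this example.

```python
# Hypothetical illustration: none of these names come from DeepProtein.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
PAD_ID = 0  # assumed id for padding positions
UNK_ID = 1  # assumed id for any character outside the 20-letter alphabet
TOKEN_TO_ID = {aa: i + 2 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence, max_len):
    """Map one sequence to a list of exactly max_len integer ids."""
    # Truncate to max_len, then look up each residue (unknown -> UNK_ID).
    ids = [TOKEN_TO_ID.get(res, UNK_ID) for res in sequence.upper()[:max_len]]
    # Right-pad short sequences so every output has the same length.
    return ids + [PAD_ID] * (max_len - len(ids))


def encode_batch(sequences, max_len=None):
    """Encode several sequences, padding to the longest one by default."""
    if max_len is None:
        max_len = max(len(s) for s in sequences)
    return [encode(s, max_len) for s in sequences]
```

The resulting equal-length id lists can be stacked into a tensor in whichever framework is used; pretrained models such as Prot-T5 ship their own tokenizers, which would replace this hand-written mapping.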
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research · Machine Learning in Bioinformatics · vaccines and immunoinformatics approaches
Methods: Attention Is All You Need · Laplacian EigenMap · Dense Connections · Adam · Linear Layer · Residual Connection · Position-Wise Feed-Forward Layer · Label Smoothing · Laplacian Positional Encodings · Dropout
