Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding

Zihan Liu; Jiaqi Wang; Yun Luo; Shuang Zhao; Wenbin Li; Stan Z. Li

arXiv:2307.09169·q-bio.BM·July 19, 2023

TL;DR

This study systematically evaluates peptide encoding methods using advanced deep learning models on a large molecular dynamics dataset, significantly improving peptide self-assembly prediction accuracy and providing a benchmark for future peptide property predictions.

Contribution

The paper benchmarks peptide encoding techniques with state-of-the-art deep learning models and identifies the Transformer as the most effective model for peptide self-assembly prediction.

Findings

1. Transformer outperforms other models in peptide self-assembly prediction.
2. Peptide encoding as sequences and graphs significantly impacts prediction accuracy.
3. Decapeptides are effectively predicted using the proposed models.

Abstract

In recent years, there has been an explosion of research on the application of deep learning to the prediction of various peptide properties, due to the significant development and market potential of peptides. Molecular dynamics has enabled the efficient collection of large peptide datasets, providing reliable training data for deep learning. However, the lack of systematic analysis of the peptide encoding, which is essential for AI-assisted peptide-related tasks, makes it an urgent problem to be solved for the improvement of prediction accuracy. To address this issue, we first collect a high-quality, colossal simulation dataset of peptide self-assembly containing over 62,000 samples generated by coarse-grained molecular dynamics (CGMD). Then, we systematically investigate the effect of peptide encoding of amino acids into sequences and molecular graphs using state-of-the-art…
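
The two encodings named in the abstract, sequences and graphs, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the vocabulary order, the function names `encode_sequence` and `encode_graph`, and above all the residue-level graph, which simply links consecutive residues and is much coarser than a molecular graph. It is not the encoding used in the paper or its repository.

```python
from __future__ import annotations

# Illustrative only: vocabulary order and graph construction are assumptions,
# not the scheme used by the paper or its code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(peptide: str) -> list[int]:
    """Sequence view: map each residue to an integer token."""
    # Raises KeyError for letters outside the 20 standard codes.
    return [TOKEN_ID[aa] for aa in peptide.upper()]


def encode_graph(peptide: str) -> tuple[list[int], list[tuple[int, int]]]:
    """Graph view: nodes are residue tokens, edges link neighbouring residues.

    Each backbone link is stored in both directions so the edge list can be
    read as an undirected graph.
    """
    nodes = encode_sequence(peptide)
    edges: list[tuple[int, int]] = []
    for i in range(len(nodes) - 1):
        edges.append((i, i + 1))
        edges.append((i + 1, i))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = encode_graph("ACD")
    print(nodes)  # [0, 1, 2]
    print(edges)  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

A sequence model such as a Transformer would consume the token list directly, while a graph model would take the node list together with the edge list. A realistic molecular graph would use atoms or coarse-grained beads as nodes rather than whole residues.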

Code & Models

Repositories

zihan-liu-00/dl_for_peptide
PyTorch · Official

Taxonomy

Topics: Machine Learning in Bioinformatics · Chemical Synthesis and Analysis · Supramolecular Self-Assembly in Materials

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection · Absolute Position Encodings · Adam · Layer Normalization · Label Smoothing