Exploring Latent Space for Generating Peptide Analogs Using Protein Language Models
Po-Yu Liang, Xueting Huang, Tibo Duran, Andrew J. Wiemer, Jun Bai

TL;DR
This paper introduces an autoencoder-based approach that uses protein language models to generate peptide analogs from a single sequence of interest, outperforming baseline models on similarity indicators for peptide structure, descriptors, and bioactivity without requiring large datasets.
Contribution
The study presents a new method that explores protein embedding space for peptide analog generation using minimal data, outperforming baseline models in key metrics.
Findings
Improved similarity indicators for peptide structures, descriptors, and bioactivities relative to baseline models.
Generated peptide analogs with similar yet distinct properties.
Validated approach through Molecular Dynamics simulations on TIGIT inhibitors.
Abstract
Generating peptides with desired properties is crucial for drug discovery and biotechnology. Traditional sequence-based and structure-based methods often require extensive datasets, which limits their effectiveness. In this study, we propose a novel method that uses autoencoder-shaped models to explore the protein embedding space and generate novel peptide analogs by leveraging protein language models. The proposed method requires only a single sequence of interest, avoiding the need for large datasets. Our results show significant improvements over baseline models in similarity indicators of peptide structures, descriptors, and bioactivities. Validation through Molecular Dynamics simulations on TIGIT inhibitors demonstrates that the method produces peptide analogs with similar yet distinct properties, highlighting its potential to enhance peptide screening…
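The abstract's core idea, exploring an embedding space around a single sequence and decoding nearby points back into peptides, can be illustrated with a toy sketch. This is not the authors' model. It assumes a random per-residue embedding table in place of a protein language model, Gaussian perturbation as the exploration step, and nearest-neighbor lookup as the decoder. Every name here (`make_toy_embedding_table`, `generate_analogs`, `noise_scale`) is hypothetical, and the seed peptide is arbitrary.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_toy_embedding_table(dim=16, seed=0):
    """Assumption: a random table stands in for protein language model embeddings."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(len(AMINO_ACIDS), dim))


def encode(sequence, table):
    """Map each residue to its embedding vector (shape: len(sequence) x dim)."""
    indices = [AMINO_ACIDS.index(residue) for residue in sequence]
    return table[indices]


def decode(latent, table):
    """Map each latent vector to the residue with the nearest embedding."""
    distances = np.linalg.norm(latent[:, None, :] - table[None, :, :], axis=-1)
    return "".join(AMINO_ACIDS[i] for i in distances.argmin(axis=1))


def generate_analogs(sequence, table, n=5, noise_scale=0.5, seed=1):
    """Perturb the latent representation of one sequence and decode each sample."""
    rng = np.random.default_rng(seed)
    latent = encode(sequence, table)
    analogs = []
    for _ in range(n):
        perturbed = latent + rng.normal(scale=noise_scale, size=latent.shape)
        analogs.append(decode(perturbed, table))
    return analogs


def sequence_identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    table = make_toy_embedding_table()
    seed_peptide = "GSWTHEAKLRV"  # arbitrary example, not taken from the paper
    for analog in generate_analogs(seed_peptide, table):
        print(analog, round(sequence_identity(seed_peptide, analog), 2))
```

In this sketch, `noise_scale` controls how far the samples move from the original sequence: at zero the decoder returns the input unchanged, and larger values yield more substitutions. A learned encoder and decoder would replace the lookup table and nearest-neighbor step in any realistic setting.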
Taxonomy
Topics: Machine Learning in Bioinformatics · Biomedical Text Mining and Ontologies · Topic Modeling
