Protein Language Models and Structure Prediction: Connection and Progression
Bozhen Hu, Jun Xia, Jiangbin Zheng, Cheng Tan, Yufei Huang, Yongjie Xu, Stan Z. Li

TL;DR
This paper reviews recent progress in protein language models and their application to protein structure prediction, highlighting methodologies, advancements, and future research directions in the field.
Contribution
It provides a systematic survey connecting NLP-inspired language models with protein structure prediction, bridging gaps and summarizing recent developments and challenges.
Findings
Protein language model (pLM)-based pipelines are now mainstream in protein structure prediction (PSP)
Recent advances have improved tertiary structure prediction accuracy
The survey identifies key challenges and future directions in the field
Abstract
The prediction of protein structures from sequences is an important task for function prediction, drug design, and understanding related biological processes. Recent advances have demonstrated the power of language models (LMs) in processing protein sequence databases; these models inherit the advantages of attention networks and capture useful information when learning protein representations. The past two years have witnessed remarkable success in tertiary protein structure prediction (PSP), including evolution-based and single-sequence-based PSP. Instead of energy-based models and sampling procedures, protein language model (pLM)-based pipelines appear to have emerged as the mainstream paradigm in PSP. Despite the fruitful progress, the PSP community needs a systematic and up-to-date survey to help bridge the gap between LMs in the natural language processing (NLP) and PSP domains…
Taxonomy
Topics: Machine Learning in Bioinformatics · Protein Structure and Dynamics · Machine Learning in Materials Science
