Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models
Yuchi Qiu, Guo-Wei Wei

TL;DR
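This paper reviews how topological data analysis (TDA) and deep protein language models support AI-aided protein engineering: language models trained on protein databases help navigate a mutational space too vast for experiments alone, while TDA and AI-based structure prediction such as AlphaFold2 enable structure-based strategies.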
Contribution
It offers a comprehensive, systematic overview of the methodological components of ML-assisted protein engineering, including TDA and NLP-based techniques, to facilitate their future development.
Findings
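Advances in TDA make more powerful structure-based, ML-assisted protein engineering strategies possible.
NLP-based ML models, leveraging accumulated protein databases, have considerably expedited protein engineering.
AI-based structure prediction, such as AlphaFold2, supports these structure-based strategies.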
Abstract
Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.
