Mathematics-assisted directed evolution and protein engineering
Yuchi Qiu, Guo-Wei Wei

TL;DR
This paper discusses how advanced topological data analysis techniques, especially persistent Laplacians, can enhance AI-assisted directed evolution for protein engineering by overcoming current limitations.
Contribution
It introduces persistent topological Laplacians as a new approach to improve structure-based embeddings in AI-assisted protein engineering.
Findings
Persistent Laplacians improve structure-based embeddings.
Topological methods outperform sequence-based approaches.
Mathematics-assisted deep learning shows promise for protein design.
Abstract
Directed evolution is a molecular biology technique that is transforming protein engineering by creating proteins with desirable properties and functions. However, it is experimentally impossible to perform the deep mutational scanning of the entire protein library due to the enormous mutational space, which scales as $20^N$, where $N$ is the number of amino acids. This has led to the rapid growth of AI-assisted directed evolution (AIDE) or AI-assisted protein engineering (AIPE) as an emerging research field. Aided with advanced natural language processing (NLP) techniques, including long short-term memory, autoencoder, and transformer, sequence-based embeddings have been dominant approaches in AIDE and AIPE. Persistent Laplacians, an emerging technique in topological data analysis (TDA), have made structure-based embeddings a superb option in AIDE and AIPE. We argue that a class of…
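To make the scale of the mutational space concrete, here is a minimal Python sketch. It assumes the 20 canonical amino acids at each of N positions, so the full sequence space has 20^N members. The function names are hypothetical and used only for illustration; they do not come from the paper.

```python
from math import comb


def sequence_space_size(n_positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of length n_positions.

    Assumes each position can independently hold any of alphabet_size
    residues (20 canonical amino acids by default).
    """
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return alphabet_size ** n_positions


def variants_with_k_mutations(n_positions: int, k: int, alphabet_size: int = 20) -> int:
    """Number of variants that differ from a fixed wild type at exactly k positions.

    Choose which k positions mutate, then pick one of the
    (alphabet_size - 1) alternative residues at each of them.
    """
    if not 0 <= k <= n_positions:
        raise ValueError("k must satisfy 0 <= k <= n_positions")
    return comb(n_positions, k) * (alphabet_size - 1) ** k


if __name__ == "__main__":
    # Illustrative lengths only; real proteins are often several hundred residues long.
    for n in (2, 10, 100):
        total = sequence_space_size(n)
        singles = variants_with_k_mutations(n, 1)
        print(f"N={n}: {singles} single mutants, about {total:.3e} sequences in total")
```

Under these assumptions, the number of single mutants grows only linearly with N (19N), which is why single-site scanning is feasible experimentally while the full library is not.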
Taxonomy
Topics
Topological and Geometric Data Analysis · Bioinformatics and Genomic Networks · Cell Image Analysis Techniques
