Machine learning-assisted directed protein evolution with combinatorial libraries
Zachary Wu, S. B. Jennifer Kan, Russell D. Lewis, Bruce J. Wittmann, Frances H. Arnold

TL;DR
This paper demonstrates how machine learning can accelerate directed protein evolution by efficiently exploring combinatorial sequence space, leading to higher fitness variants and enabling stereodivergent enzyme evolution with high enantioselectivity.
Contribution
It introduces a machine learning-guided approach for directed protein evolution that improves efficiency and success rate over traditional methods.
Findings
Machine learning models successfully predicted high-fitness protein variants.
The approach identified enzyme variants with 93% and 79% enantiomeric excess (ee) for the two product enantiomers.
Machine learning increased throughput and diversity in protein engineering.
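For readers unfamiliar with the metric, enantiomeric excess is the absolute difference between the amounts of the two enantiomers divided by their sum, expressed as a percentage. The sketch below only illustrates that arithmetic; the function name and the example ratios are hypothetical and are not data from the paper.

```python
def enantiomeric_excess(amount_a: float, amount_b: float) -> float:
    """Return ee in percent from the amounts (or peak areas) of two enantiomers."""
    total = amount_a + amount_b
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * abs(amount_a - amount_b) / total


# Illustrative ratios only: a 96.5:3.5 mixture corresponds to 93% ee,
# and an 89.5:10.5 mixture corresponds to 79% ee.
ee_high = enantiomeric_excess(96.5, 3.5)
ee_low = enantiomeric_excess(89.5, 10.5)
```

A racemic (50:50) mixture gives 0% ee and a single pure enantiomer gives 100% ee.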
Abstract
To reduce experimental effort associated with directed protein evolution and to explore the sequence space encoded by mutating multiple positions simultaneously, we incorporate machine learning in the directed evolution workflow. Combinatorial sequence space can be quite expensive to sample experimentally, but machine learning models trained on tested variants provide a fast method for testing sequence space computationally. We validate this approach on a large published empirical fitness landscape for human GB1 binding protein, demonstrating that machine learning-guided directed evolution finds variants with higher fitness than those found by other directed evolution approaches. We then provide an example application in evolving an enzyme to produce each of the two possible product enantiomers (stereodivergence) of a new-to-nature carbene Si-H insertion reaction. The approach predicted…
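The loop the abstract describes (screen some variants, train a model on them, rank the rest of the combinatorial space computationally) can be sketched roughly as follows. This is a toy illustration under loud assumptions: the four-letter alphabet, the `toy_fitness` function standing in for a wet-lab screen, and the simple per-position averaging model are all hypothetical and are not the models or data used in the paper.

```python
import itertools
import random

ALPHABET = "ACDE"   # toy alphabet (assumption); real libraries use 20 amino acids
N_POSITIONS = 4     # number of positions mutated simultaneously


def all_variants():
    """Enumerate the full combinatorial sequence space."""
    return ["".join(p) for p in itertools.product(ALPHABET, repeat=N_POSITIONS)]


def toy_fitness(variant):
    """Hypothetical stand-in for an experimental screen (not real data)."""
    score = sum((ALPHABET.index(aa) + 1) * (pos + 1) for pos, aa in enumerate(variant))
    if variant[0] == "E" and variant[1] == "A":
        score += 10.0  # one pairwise interaction term, to make the landscape non-additive
    return float(score)


def fit_additive_model(measured):
    """Average the measured fitness for each (position, residue) pair."""
    sums, counts = {}, {}
    for variant, fitness in measured.items():
        for key in enumerate(variant):  # key is (position, residue)
            sums[key] = sums.get(key, 0.0) + fitness
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(model, variant, default):
    """Score a variant; unseen (position, residue) pairs fall back to the mean."""
    return sum(model.get(key, default) for key in enumerate(variant))


def propose_next_round(measured, n_proposals):
    """Rank every untested variant in silico and return the top predictions."""
    model = fit_additive_model(measured)
    default = sum(measured.values()) / len(measured)
    untested = [v for v in all_variants() if v not in measured]
    untested.sort(key=lambda v: predict(model, v, default), reverse=True)
    return untested[:n_proposals]


rng = random.Random(0)                       # fixed seed for reproducibility
training = rng.sample(all_variants(), 40)    # variants "screened" in round one
measured = {v: toy_fitness(v) for v in training}
proposals = propose_next_round(measured, 10) # variants to test in the next round
```

In a real campaign the proposals would be synthesized and screened, and their measurements added to `measured` before the next round. A purely additive model like this one cannot capture interactions between positions, so a more expressive regressor would normally take its place.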
