Adaptive machine learning for protein engineering

Brian L. Hie; Kevin K. Yang

arXiv:2106.05466·q-bio.QM·July 7, 2021

TL;DR

This paper reviews adaptive machine learning strategies for protein engineering, focusing on sequence selection and optimization across multiple rounds to efficiently discover functional proteins.

Contribution

It reviews how a sequence-to-function surrogate model can be used to select sequences for experimental measurement, both in a single round of optimization and adaptively across multiple rounds.

Findings

01

A single round of model-guided sequence selection can make experimental measurement more efficient.

02

Sequential optimization across multiple rounds improves the discovery of optimized sequences.

03

Retraining the model after each round of measurement can accelerate the protein engineering process.

Abstract

Machine-learning models that learn from data to predict how protein sequence encodes function are emerging as a useful protein engineering tool. However, when using these models to suggest new protein designs, one must deal with the vast combinatorial complexity of protein sequences. Here, we review how to use a sequence-to-function machine-learning surrogate model to select sequences for experimental measurement. First, we discuss how to select sequences through a single round of machine-learning optimization. Then, we discuss sequential optimization, where the goal is to discover optimized sequences and improve the model across multiple rounds of training, optimization, and experimental measurement.
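The loop described in the abstract (train a surrogate, optimize against it, measure the chosen sequences, repeat) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any paper covered by the review: the surrogate is a bootstrap ensemble of ridge regressions on one-hot features, the selection rule is an upper-confidence-bound score, candidates come from a fixed pool, and `toy_measure` is a hypothetical stand-in for a wet-lab assay. All function names and parameters (`fit_ensemble`, `select_batch`, `run_rounds`, `beta`, `lam`) are invented for this example.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids
AA_INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot(seq):
    """Flatten a sequence into a (len(seq) * 20,) one-hot vector."""
    x = np.zeros((len(seq), len(ALPHABET)))
    x[np.arange(len(seq)), [AA_INDEX[a] for a in seq]] = 1.0
    return x.ravel()


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression weights (assumed surrogate model)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def fit_ensemble(X, y, rng, n_models=10):
    """Bootstrap ensemble; disagreement between members is a crude uncertainty."""
    n = len(y)
    members = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        members.append(fit_ridge(X[idx], y[idx]))
    return np.stack(members)  # shape (n_models, n_features)


def select_batch(W, X_pool, batch_size, beta=1.0):
    """Single-round selection: rank candidates by mean + beta * std (UCB)."""
    preds = X_pool @ W.T  # shape (n_candidates, n_models)
    score = preds.mean(axis=1) + beta * preds.std(axis=1)
    return np.argsort(-score)[:batch_size]


def run_rounds(pool, measure, n_rounds=3, batch_size=8, seed=0):
    """Sequential optimization: retrain, select, and measure each round."""
    rng = np.random.default_rng(seed)
    X_pool = np.stack([one_hot(s) for s in pool])
    measured = {}  # pool index -> measured value

    # Initial round: no model yet, so measure a random batch.
    for i in rng.choice(len(pool), size=batch_size, replace=False):
        measured[int(i)] = measure(pool[i])

    for _ in range(n_rounds):
        idx = np.array(list(measured))
        y = np.array([measured[i] for i in idx])
        W = fit_ensemble(X_pool[idx], y, rng)
        remaining = np.array([i for i in range(len(pool)) if i not in measured])
        picks = remaining[select_batch(W, X_pool[remaining], batch_size)]
        for i in picks:
            measured[int(i)] = measure(pool[i])
    return measured


def toy_measure(seq, target="ACDE"):
    """Hypothetical assay: number of positions matching an arbitrary target."""
    return float(sum(a == b for a, b in zip(seq, target)))


# Hypothetical candidate pool of random length-4 sequences.
_rng = np.random.default_rng(1)
pool = sorted({"".join(_rng.choice(list(ALPHABET), size=4)) for _ in range(200)})

measured = run_rounds(pool, toy_measure, n_rounds=3, batch_size=8, seed=0)
best = max(measured, key=measured.get)
print(pool[best], measured[best])
```

Calling `select_batch` once on a trained ensemble corresponds to the single-round setting; `run_rounds` wraps it in the multi-round setting, where each new batch of measurements also improves the model. The `beta` parameter is one simple way to trade off exploiting high predictions against exploring uncertain candidates.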
