ProteinGuide: On-the-fly property guidance for protein sequence generative models
Junhao Xiong, Ishan Gaur, Maria Lukarska, Hunter Nisonoff, Luke M. Oltrogge, David F. Savage, Jennifer Listgarten

TL;DR
ProteinGuide is an on-the-fly conditioning framework for protein sequence generative models, including masked language models, any-order auto-regressive models, and diffusion and flow matching models. It steers generation toward specified properties without retraining the generative model.
Contribution
ProteinGuide provides a unifying statistical framework for on-the-fly conditioning of diverse protein generative models, demonstrated through in silico and in vivo experiments.
Findings
Successfully designed proteins with desired properties such as stability and activity.
Optimized multiple properties simultaneously in protein design.
Enhanced base editing efficiency in vivo with a single round of guidance.
Abstract
Sequence generative models are transforming protein engineering. However, no principled framework exists for conditioning these models on auxiliary information, such as experimental data, without additional training of a generative model. Herein, we present ProteinGuide, a method for such "on-the-fly" conditioning, amenable to a broad class of protein generative models including Masked Language Models (e.g. ESM3), any-order auto-regressive models (e.g. ProteinMPNN) as well as diffusion and flow matching models (e.g. MultiFlow). ProteinGuide stems from our unifying view of these model classes under a single statistical framework. As proof of principle, we perform several in silico experiments. We first guide pre-trained generative models to design proteins with user-specified properties, such as higher stability or activity. Next, we design for optimizing two desired properties that are…
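To make the idea of "on-the-fly" conditioning concrete, the following is a minimal sketch of generic Bayes-rule guidance for a single sequence position. It is not taken from the paper and may differ from ProteinGuide's actual algorithm. It assumes a pre-trained model supplies unconditional token probabilities p(x), a separate property predictor supplies p(y | x) for each candidate token, and the guided distribution is proportional to p(x) · p(y | x)^γ. The function name, the `guidance_strength` parameter, and the toy numbers are all hypothetical.

```python
def guided_token_distribution(base_probs, property_likelihoods, guidance_strength=1.0):
    """Reweight a generative model's token probabilities by a property predictor.

    base_probs: unconditional probabilities p(x) for each candidate token
        (assumed to come from a pre-trained generative model).
    property_likelihoods: p(y | x) for each candidate token
        (assumed to come from a separately trained property predictor).
    guidance_strength: exponent gamma on the predictor term; 0 recovers the
        unguided model, larger values push harder toward the property.
    """
    if len(base_probs) != len(property_likelihoods):
        raise ValueError("inputs must have one entry per candidate token")
    # Bayes' rule up to a constant: p(x | y) is proportional to p(x) * p(y | x)^gamma.
    weights = [
        p * (lik ** guidance_strength)
        for p, lik in zip(base_probs, property_likelihoods)
    ]
    total = sum(weights)
    if total <= 0:
        raise ValueError("guided weights sum to zero; cannot normalize")
    return [w / total for w in weights]


# Toy example with three candidate tokens (numbers are illustrative only).
base = [0.5, 0.3, 0.2]          # the generative model prefers token 0
likelihood = [0.1, 0.6, 0.9]    # the predictor says token 0 rarely has the property
guided = guided_token_distribution(base, likelihood)
unguided = guided_token_distribution(base, likelihood, guidance_strength=0.0)
```

In this sketch the generative model is never retrained: only its output probabilities are reweighted at sampling time, which is the sense in which the conditioning happens without additional training.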
Taxonomy
Topics: CRISPR and Genetic Engineering · Genomics and Rare Diseases · RNA and protein synthesis mechanisms
Methods: Balanced Selection · Diffusion
