Scoring-Assisted Generative Exploration for Proteins (SAGE-Prot): A Framework for Multi-Objective Protein Optimization via Iterative Sequence Generation and Evaluation
Hocheol Lim, Geon-Ho Lee, and Kyoung Tai No

TL;DR
SAGE-Prot is an iterative framework that pairs a generative sequence model with property predictors to optimize protein sequences for several traits at once, such as binding affinity and enzymatic activity.
Contribution
The paper introduces an iterative loop in which generated sequences are scored by property predictors, so that proteins can be optimized across multiple objectives.
Findings
Achieved up to a 17-fold increase in enzyme activity.
Optimized proteins for multiple properties at once, including GB1 for binding affinity and thermal stability.
Demonstrated rapid adaptation to complex design goals.
Abstract
Proteins play essential roles in nature, from catalyzing biochemical reactions to binding specific targets. Advances in protein engineering have the potential to revolutionize biotechnology and healthcare by designing proteins with tailored properties. Machine learning and generative models have transformed protein design by enabling the exploration of vast sequence-function landscapes. Here, we introduce Scoring-Assisted Generative Exploration for Proteins (SAGE-Prot), a framework that iteratively combines autoregressive protein generation with quantitative structure-property relationship models for fine-tuned optimization. By integrating diverse protein descriptors, SAGE-Prot enhances key properties, including binding affinity, thermal stability, enzymatic activity, and solubility. We demonstrate its effectiveness by optimizing GB1 for binding affinity and thermal stability and TEM-1…
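The generate, score, and select loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the random point-mutation generator stands in for the autoregressive model, the composition-based score stands in for the trained property predictors, and every name here (toy_generate, toy_score, optimize, the weights, and the seed string) is a hypothetical placeholder chosen for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_generate(parent, rng, n_mutations=1):
    """Assumed stand-in for a generative model: random point mutations."""
    seq = list(parent)
    for _ in range(n_mutations):
        pos = rng.randrange(len(seq))
        seq[pos] = rng.choice(AMINO_ACIDS)
    return "".join(seq)


def toy_score(seq, weights):
    """Assumed stand-in for property predictors: a weighted sum of two
    composition-based toy objectives (not real biophysical models)."""
    hydrophobic = sum(seq.count(a) for a in "AILMFVW") / len(seq)
    charged = sum(seq.count(a) for a in "DEKR") / len(seq)
    return weights["hydrophobic"] * hydrophobic + weights["charged"] * charged


def optimize(seed, weights, rounds=20, pool_size=50, keep=5, rng_seed=0):
    """Iterate: generate candidates from the elite set, score them,
    and keep the top scorers as the next round's elite set."""
    rng = random.Random(rng_seed)
    elite = [seed]
    history = []
    for _ in range(rounds):
        # Carry the current elite forward so the best score never drops.
        candidates = list(elite)
        while len(candidates) < pool_size:
            candidates.append(toy_generate(rng.choice(elite), rng))
        ranked = sorted(candidates, key=lambda s: toy_score(s, weights), reverse=True)
        elite = ranked[:keep]
        history.append(toy_score(elite[0], weights))
    return elite, history


if __name__ == "__main__":
    # Arbitrary toy seed sequence, not a real protein.
    weights = {"hydrophobic": 1.0, "charged": 0.5}
    elite, history = optimize("MKTAYIAKQR", weights, rounds=10)
    print(elite[0], history[-1])
```

Because the elite sequences are carried into each new candidate pool, the best score in this sketch cannot decrease from one round to the next. Combining objectives as a weighted sum is only one option for multi-objective scoring and is used here as an assumption, not as the scheme the paper uses.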
Taxonomy
Topics: Protein Structure and Dynamics · Genetics, Bioinformatics, and Biomedical Research
