Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

Minji Lee; Luiz Felipe Vecchietti; Hyunkyu Jung; Hyun Joo Ro; Meeyoung Cha; Ho Min Kim

arXiv:2405.18986·cs.LG·May 30, 2024·1 cites

TL;DR

This paper introduces LatProtRL, a reinforcement learning method that optimizes protein sequences by acting in a learned latent space. On two fitness optimization tasks it achieves fitness comparable or superior to baseline methods.

Contribution

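The paper models protein optimization as a Markov decision process and applies reinforcement learning directly in a latent space learned by an encoder-decoder built on a large protein language model. The approach is designed to escape local optima, including when starting from low-fitness sequences.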

Findings

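01

Achieves fitness comparable or superior to baseline methods on two fitness optimization tasks.

02

Generated sequences reach high-fitness regions, as supported by the paper's findings and in vitro evaluation.

03

Shows potential for lab-in-the-loop protein optimization scenarios.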

Abstract

Proteins are complex molecules responsible for different functions in nature. Enhancing the functionality of proteins and cellular fitness can significantly impact various industries. However, protein optimization using computational methods remains challenging, especially when starting from low-fitness sequences. We propose LatProtRL, an optimization method to efficiently traverse a latent space learned by an encoder-decoder leveraging a large protein language model. To escape local optima, our optimization is modeled as a Markov decision process using reinforcement learning acting directly in latent space. We evaluate our approach on two important fitness optimization tasks, demonstrating its ability to achieve comparable or superior fitness over baseline methods. Our findings and in vitro evaluation show that the generated sequences can reach high-fitness regions, suggesting a…
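
The Markov decision process framing in the abstract can be illustrated with a toy sketch. This is not the LatProtRL implementation: the encoder, decoder, fitness oracle, and every name below (`encode`, `decode`, `toy_fitness`, `LatentEnv`, `random_rollout`) are hypothetical stand-ins, and a random policy takes the place of a learned one. It only shows the loop structure: the state is a latent vector, an action is a bounded perturbation of that vector, and the reward is the fitness of the decoded sequence.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(seq):
    """Toy encoder (assumption): one latent coordinate in [0, 1) per residue."""
    n = len(AMINO_ACIDS)
    return [(AMINO_ACIDS.index(a) + 0.5) / n for a in seq]


def decode(z):
    """Toy decoder (assumption): map each coordinate to the residue bin it falls in."""
    n = len(AMINO_ACIDS)
    return "".join(AMINO_ACIDS[min(n - 1, max(0, int(x * n)))] for x in z)


def toy_fitness(seq, target):
    """Hypothetical fitness oracle: fraction of positions matching a target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


class LatentEnv:
    """Minimal MDP: state = latent vector, action = bounded latent perturbation."""

    def __init__(self, start_seq, target, horizon=10, max_step=0.1):
        self.start_seq = start_seq
        self.target = target
        self.horizon = horizon
        self.max_step = max_step
        self.z = encode(start_seq)
        self.t = 0

    def reset(self):
        self.z = encode(self.start_seq)
        self.t = 0
        return list(self.z)

    def step(self, action):
        # Clip each action component, then keep the state inside [0, 1).
        clipped = [min(max(a, -self.max_step), self.max_step) for a in action]
        self.z = [min(max(x + a, 0.0), 1.0 - 1e-9) for x, a in zip(self.z, clipped)]
        self.t += 1
        seq = decode(self.z)
        reward = toy_fitness(seq, self.target)
        done = self.t >= self.horizon
        return list(self.z), reward, done, seq


def random_rollout(env, rng):
    """Run one episode with a random policy and track the best decoded sequence."""
    z = env.reset()
    best_seq = env.start_seq
    best_fit = toy_fitness(env.start_seq, env.target)
    done = False
    while not done:
        action = [rng.uniform(-env.max_step, env.max_step) for _ in z]
        z, reward, done, seq = env.step(action)
        if reward > best_fit:
            best_seq, best_fit = seq, reward
    return best_seq, best_fit


if __name__ == "__main__":
    env = LatentEnv(start_seq="AAAAAA", target="ACDCAC")
    print(random_rollout(env, random.Random(0)))
```

In a real system the toy pieces would be replaced by a trained encoder-decoder and a fitness predictor or experimental measurement, and the random policy by a policy trained with a reinforcement learning algorithm.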

Code & Models

Repositories

haewonc/latprotrl
pytorch

Taxonomy

Topics: Machine Learning and Data Classification