Guiding Generative Protein Language Models with Reinforcement Learning

Filippo Stocco; Maria Artigues-Lleixa; Andrea Hunklinger; Talal Widatalla; Marc Guell; Noelia Ferruz

arXiv:2412.12979 · q-bio.BM · December 1, 2025 · 6 citations

TL;DR

This paper introduces a reinforcement learning framework that iteratively guides protein language models toward user-defined properties such as topology or binding affinity. Applied to EGFR binder design, it yields a 26-fold increase in binding affinity in two iterations.

Contribution

The paper combines reinforcement learning with protein language models to steer generation toward user-defined objectives, so that high-fitness variants can be designed on demand in a few iterations.

Findings

1. Achieved a 26-fold increase in binding affinity for designed EGFR binders in two iterations.

2. Steered pLMs toward various protein properties, such as topologies and binding affinities.

3. Showed that a few RL iterations suffice to move pLMs through long evolutionary trajectories.

Abstract

Protein language models (pLMs) have demonstrated success at generating functional proteins across vast sequence spaces but lack the ability to design high-fitness variants on demand. Here, we iteratively guide pLMs toward user-defined objectives by applying reinforcement learning (RL). We demonstrate that RL can steer pLMs toward various protein properties, such as topologies or binding affinities, in a few iterations through long evolutionary trajectories. We apply our framework to the design of epidermal growth factor receptor (EGFR) binders, achieving a 26-fold increase in binding affinity in two iterations.
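The abstract does not say which RL objective is used. The linked repository name, dpo_plm, hints at direct preference optimization (DPO), so the following is a minimal sketch of the standard DPO loss for one preferred/rejected sequence pair under that assumption. The function name, argument names, and the default `beta` are illustrative and are not taken from the paper or its code.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for a single (chosen, rejected) sequence pair.

    Each argument is the total log-likelihood of a sequence under the
    model being tuned (policy) or the frozen starting model (ref).
    beta is an assumed value; it scales how strongly the policy is
    pushed away from the reference.
    """
    # How much the policy has shifted each sequence relative to the reference.
    chosen_shift = policy_logp_chosen - ref_logp_chosen
    rejected_shift = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_shift - rejected_shift)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical log-likelihoods: the higher-scoring binder is "chosen".
untrained = dpo_loss(-10.0, -10.0, -10.0, -10.0)  # policy equals reference
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)    # policy now prefers chosen
```

In an iterative setting, one round would sample sequences from the pLM, score them with the user-defined objective (for example a predicted affinity), form preferred/rejected pairs from the scores, and minimize this loss before sampling again. The loss falls as the policy raises the likelihood of the preferred sequence relative to the reference model.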

Code & Models

Repositories

ai4pdlab/dpo_plm (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Machine Learning in Bioinformatics · Natural Language Processing Techniques