Reinforcement-guided generative protein language models enable de novo design of highly diverse AAV capsids
Lucas Ferraz, Ana F. Rodrigues, Pedro Giesteira Cotovio, Mafalda Ventura, Gabriela Silva, Ana Sofia Coroadinha, Miguel Machuqueiro, Catia Pesquita

TL;DR
This paper introduces a reinforcement learning-guided generative framework using protein language models to design highly diverse and viable AAV capsid sequences, expanding the exploration of protein sequence space for gene therapy applications.
Contribution
The study develops a novel reinforcement learning-based method combined with protein language models for de novo AAV capsid design, enabling exploration beyond the training data while maintaining viability.
Findings
Reinforcement learning steers generation toward sequences that are more novel relative to the training data.
Fine-tuning alone biases generated sequences toward the known training sequences.
The framework balances viability and novelty when selecting candidates.
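To make the viability/novelty trade-off in candidate selection concrete, here is a minimal sketch. It is not the authors' reward function, which this summary does not specify. All names (`novelty`, `combined_score`, `rank_candidates`, `weight`) are hypothetical, the viability scores are assumed to come from some external predictor returning values in [0, 1], and novelty is approximated as one minus the highest string-similarity ratio to a reference set, which is a simple stand-in for a proper sequence-identity measure.

```python
from difflib import SequenceMatcher


def novelty(candidate, references):
    """Return 1 minus the highest similarity ratio to any reference sequence.

    Assumption: string similarity from difflib stands in for sequence identity.
    """
    if not references:
        return 1.0
    best = max(
        SequenceMatcher(None, candidate, ref, autojunk=False).ratio()
        for ref in references
    )
    return 1.0 - best


def combined_score(viability, novelty_score, weight=0.5):
    """Weighted sum of viability and novelty; `weight` is a hypothetical knob."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * viability + (1.0 - weight) * novelty_score


def rank_candidates(candidates, references, weight=0.5):
    """Sort candidates by combined score, best first.

    `candidates` maps each sequence to a viability score in [0, 1],
    assumed to be supplied by an external (hypothetical) predictor.
    """
    scored = [
        (seq, combined_score(viab, novelty(seq, references), weight))
        for seq, viab in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy sequences, not real capsid data.
    refs = ["ACDE"]
    cands = {"ACDE": 0.9, "WWWW": 0.9}
    for seq, score in rank_candidates(cands, refs):
        print(seq, round(score, 3))
```

With equal viability scores, the candidate that differs most from the reference set ranks first; raising `weight` toward 1 shifts the ranking toward viability alone.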
Abstract
Adeno-associated viral (AAV) vectors are widely used delivery platforms in gene therapy, and the design of improved capsids is key to expanding their therapeutic potential. A central challenge in AAV bioengineering, as in protein design more broadly, is the vast sequence design space relative to the scale of feasible experimental screening. Machine-guided generative approaches provide a powerful means of navigating this landscape and proposing novel protein sequences that satisfy functional constraints. Here, we develop a generative design framework based on protein language models and reinforcement learning to generate highly novel yet functionally plausible AAV capsids. A pretrained model was fine-tuned on experimentally validated capsid sequences to learn patterns associated with viability. Reinforcement learning was then used to guide sequence generation, with a reward function that…
Taxonomy
Topics: Virus-based gene therapy research · Monoclonal and Polyclonal Antibodies Research · Transgenic Plants and Applications
