Designing RNAs with Language Models

Milan Gautam; Ning Dai; Tianshuo Zhou; Bowen Xie; David Mathews; Liang Huang

arXiv:2602.12470·cs.LG·February 16, 2026

Open Access

TL;DR

This paper frames RNA design as conditional sequence generation: an autoregressive language model maps a target secondary structure directly to a sequence, and is reported to outperform traditional optimization-based methods in both efficiency and accuracy.

Contribution

The authors propose a neural language model for RNA design that is first trained with supervision on structure-sequence pairs and then refined with reinforcement learning on end-to-end metrics, with the aim of improving performance and scalability.

Findings

01 Outperforms state-of-the-art on key metrics

02 Achieves 1.7x faster design process

03 Effective across multiple datasets

Abstract

RNA design, the task of finding a sequence that folds into a target secondary structure, has broad biological and biomedical impact but remains computationally challenging due to the exponentially large sequence space and exponentially many competing folds. Traditional approaches treat it as an optimization problem, relying on per-instance heuristics or constraint-based search. We instead reframe RNA design as conditional sequence generation and introduce a reusable neural approximator, instantiated as an autoregressive language model (LM), that maps target structures directly to sequences. We first train our model in a supervised setting on random-induced structure-sequence pairs, and then use reinforcement learning (RL) to optimize end-to-end metrics. We also propose methods to select a small subset for RL that greatly improves RL efficiency and quality. Across four datasets, our…
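To make the input and output of the design task concrete, here is a minimal Python sketch. It assumes the target structure is written in dot-bracket notation and that only Watson-Crick and G-U wobble pairs are allowed. The function names `parse_dot_bracket` and `is_compatible` are hypothetical and do not come from the paper. The check covers only a necessary condition: a compatible sequence still has to be folded by a structure-prediction engine to confirm that it actually adopts the target.

```python
from typing import Dict

# Assumption: canonical Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
    ("G", "U"), ("U", "G"),
}


def parse_dot_bracket(structure: str) -> Dict[int, int]:
    """Map every paired position to its partner in a dot-bracket string."""
    stack = []
    pairs: Dict[int, int] = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def is_compatible(sequence: str, structure: str) -> bool:
    """Check that every paired position holds an allowed base pair.

    This is a necessary condition for a valid design, not a sufficient one.
    """
    if len(sequence) != len(structure):
        return False
    pairs = parse_dot_bracket(structure)
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in pairs.items())


if __name__ == "__main__":
    target = "((...))"
    print(is_compatible("GCAAAGC", target))  # both pairs are G-C / C-G
    print(is_compatible("GCAAAGA", target))  # outer pair G-A is not allowed
```

A generative model for this task would take `target` as its conditioning input and emit a candidate sequence; a filter like `is_compatible` could discard obviously invalid candidates before they are passed to a folding engine.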

Taxonomy

Topics: RNA and protein synthesis mechanisms · Machine Learning in Materials Science · RNA Interference and Gene Delivery