RiboPO: Preference Optimization for Structure- and Stability-Aware RNA Design
Minghao Sun, Hanqun Cao, Zhou Zhang, Chen Wei, Liang Wang, Tianrui Jia, Zhiyuan Liu, Tianfan Fu, Xiangru Tang, Yejin Choi, Pheng-Ann Heng, Fang Wu, Yang Zhang

TL;DR
RiboPO introduces a reinforcement learning framework that optimizes RNA sequences for accurate 3D structure, stability, and global geometry, outperforming existing methods on key benchmarks.
Contribution
The paper presents RiboPO, a preference optimization approach that uses reinforcement learning from physical feedback for multi-objective RNA design.
Findings
Improves MFE by 12.3% over baselines
Increases secondary-structure self-consistency by 20%
Achieves 11% higher pass@64 sampling efficiency
Abstract
Designing RNA sequences that reliably adopt specified three-dimensional structures while maintaining thermodynamic stability remains challenging for synthetic biology and therapeutics. Current inverse folding approaches optimize for sequence recovery or single structural metrics, failing to simultaneously ensure global geometry, local accuracy, and ensemble stability, three interdependent requirements for functional RNA design. This gap becomes critical when designed sequences encounter dynamic biological environments. We introduce RiboPO, a Ribonucleic acid Preference Optimization framework that addresses this multi-objective challenge through reinforcement learning from physical feedback (RLPF). RiboPO fine-tunes gRNAde by constructing preference pairs from composite physical criteria that couple global 3D fidelity and thermodynamic stability. Preferences are formed using structural…
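The preference-pair construction mentioned in the abstract can be illustrated with a minimal sketch. It assumes, following the review excerpts on this page, that pairs are built from pLDDT, RMSD, and MFE and trained with a DPO-style objective. The `Candidate` fields, the weights and scaling in `composite_score`, the `margin` threshold, and `beta` are all hypothetical choices made for illustration, not the authors' settings.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    """One sampled sequence with its physical feedback (all fields hypothetical)."""
    sequence: str
    plddt: float        # predicted-structure confidence, 0..100 (higher is better)
    rmsd: float         # deviation from the target backbone in angstroms (lower is better)
    mfe: float          # minimum free energy in kcal/mol (lower is more stable)
    logp_policy: float  # sequence log-probability under the model being tuned
    logp_ref: float     # sequence log-probability under the frozen reference model


def composite_score(c, w_plddt=1.0, w_rmsd=1.0, w_mfe=1.0):
    # Assumed weighting and scaling: reward confidence, penalize RMSD,
    # and reward more negative (more stable) free energy.
    return w_plddt * c.plddt / 100.0 - w_rmsd * c.rmsd / 10.0 - w_mfe * c.mfe / 100.0


def build_pairs(candidates, margin=0.1):
    """Return (chosen, rejected) pairs whose composite scores differ by at least margin."""
    pairs = []
    for a, b in combinations(candidates, 2):
        gap = composite_score(a) - composite_score(b)
        if gap >= margin:
            pairs.append((a, b))
        elif -gap >= margin:
            pairs.append((b, a))
    return pairs


def dpo_loss(chosen, rejected, beta=0.1):
    """Standard DPO loss for one pair: -log sigmoid(beta * difference of log-ratios)."""
    z = beta * ((chosen.logp_policy - chosen.logp_ref)
                - (rejected.logp_policy - rejected.logp_ref))
    # softplus(-z), written in a numerically stable form
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

In this sketch, near-ties are dropped by the margin so that only clearly separated candidates produce a training signal. The composite score here is a plain weighted sum, which is one of many possible ways to couple structural fidelity and stability.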
Peer Reviews
Decision: Submitted to ICLR 2026
1. Introducing multi-objective optimization into RNA design is a meaningful and timely direction.
2. RiboPO demonstrates improved secondary-structure consistency and thermodynamic metrics.
1. The paper introduces a highly complicated evaluation framework, SSTT (Section 3.4), which consists of 15 different metrics. However, most of these metrics are not actually used in the model optimization process. Preference construction relies only on pLDDT, RMSD, and MFE, and the reported results primarily focus on a small subset of about seven metrics. As a result, the necessity and practical value of introducing such a complex evaluation framework are unclear.
2. The central motivation of t
- The problem being tackled is extremely significant, as RNA inverse design methods are increasingly used in practice. The paper identifies a key research gap and tackles the problem of RL-driven property optimization in RNA design very well. I particularly want to commend the authors’ efforts to develop techniques that do not just copy what is done in the proteins-ML realm, but do something that’s original and RNA-specific. To the best of my knowledge, this is the first work to tackle this impor
Though not necessarily a weakness of this paper alone, the quality of RNA 3D structure prediction is pretty poor at the moment. Thus, it's not surprising that the method does not yet lead to significant gains in terms of 3D metrics over gRNAde, as the structure predictor being used is not reliable. I believe that upcoming 3D structure predictors could push the state of the art further, and then further improve RiboPO as well. Other than these, I do not see any major weaknesses worth highlighting
* Conceptual novelty: Reformulating RNA inverse folding as multi-objective preference optimization is conceptually elegant and provides a unifying view linking structural and thermodynamic optimization.
* Well-analyzed framework: The round-wise preference construction and curriculum-based DPO training are systematically ablated, with clear evidence of trade-offs among objectives.
* Comprehensive evaluation: The SSTT benchmark covers geometric, energetic, and sequence-level properties in a single
Lack of visual RNA design analysis: The paper does not include any visual examples of designed RNA structures (e.g., 2D secondary structure plots or 3D conformational overlays). In RNA design literature (e.g., RiboDiffusion, RDesign, RhoDesign), such visualizations are essential to demonstrate whether generated sequences structurally resemble the target folds. The authors should provide a figure comparing RiboPO’s designs with ground truth (e.g., native vs. designed structure overlays) and di
Taxonomy
Topics: RNA and protein synthesis mechanisms · DNA and Nucleic Acid Chemistry · RNA modifications and cancer
