Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions
Jiayang Chen, Zhihang Hu, Siqi Sun, Qingxiong Tan, Yixuan Wang, Qinze, Yu, Licheng Zong, Liang Hong, Jin Xiao, Tao Shen, Irwin King, Yu Li

TL;DR
This paper introduces RNA-FM, a self-supervised foundation model trained on 23 million unannotated RNA sequences, significantly improving predictions of RNA structure and function without requiring labeled data.
Contribution
The paper presents a novel RNA foundation model trained via self-supervised learning on massive unannotated data, enabling accurate structural and functional predictions.
Findings
RNA-FM infers sequential and evolutionary information without labels.
RNA-FM improves accuracy in RNA structure and function predictions.
RNA-FM demonstrates versatility across multiple biological tasks.
Abstract
Non-coding RNA structure and function are essential to understanding various biological processes, such as cell signaling, gene expression, and post-transcriptional regulations. These are all among the core problems in the RNA field. With the rapid growth of sequencing technology, we have accumulated a massive amount of unannotated RNA sequences. On the other hand, expensive experimental observatory results in only limited numbers of annotated data and 3D structures. Hence, it is still challenging to design computational methods for predicting their structures and functions. The lack of annotated data and systematic study causes inferior performance. To resolve the issue, we propose a novel RNA foundation model (RNA-FM) to take advantage of all the 23 million non-coding RNA sequences through self-supervised learning. Within this approach, we discover that the pre-trained RNA-FM could…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRNA and protein synthesis mechanisms · Machine Learning in Bioinformatics · RNA modifications and cancer
