# DRFormer: A Benchmark Model for RNA Sequence Downstream Tasks

**Authors:** Jianqi Fu, Haohao Li, Yanlei Kang, Hancan Zhu, Tiren Huang, Zhong Li

PMC · DOI: 10.3390/genes16030284 · Genes · 2025-02-26

## TL;DR

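DRFormer is a multimodal RNA model that builds vision features from secondary structure and sequence distance, pre-trains them with SWIN, and combines the resulting SWIN-RNA submodel with an RNA sequence model for downstream tasks such as sequence classification, protein–RNA interaction prediction, and secondary structure prediction.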

## Contribution

The authors present DRFormer as the first RNA sequence downstream analysis model to build a vision model from structural features and to integrate the sequence and vision models in a multimodal framework.

## Key findings

- Reached an MCC of 94.4% in RNA sequence classification, surpassing the RNAErnie model by 1.2%.
- Reached an MCC of 0.492 in protein–RNA interaction prediction, outperforming BERT-RBP and PrismNet.
- Reached an F1 score of 0.690 in RNA secondary structure prediction, exceeding SPOT-RNA by 1%.
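The findings above are reported as MCC (Matthews correlation coefficient) and F1. As a reminder of what these metrics measure, here is a minimal sketch of their standard binary definitions, computed from confusion-matrix counts. The function names and the example counts are hypothetical, and the paper's own evaluation code (including any multi-class form of MCC used for sequence classification) may differ.

```python
import math


def mcc(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention (an assumption here): return 0.0 when a marginal is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts, for illustration only (not results from the paper).
print(mcc(tp=50, tn=50, fp=0, fn=0))
print(f1(tp=8, fp=2, fn=2))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than accuracy. F1 ignores true negatives, which suits base-pair prediction, where most position pairs are unpaired.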

## Abstract

**Background/Objectives:** RNA research is critical for understanding gene regulation, disease mechanisms, and therapeutic development. Constructing effective RNA benchmark models for accurate downstream analysis has become a significant research challenge. The objective of this study is to propose a robust benchmark model, DRFormer, for RNA sequence downstream tasks.

**Methods:** The DRFormer model utilizes RNA sequences to construct novel vision features based on secondary structure and sequence distance. These features are pre-trained using the SWIN model to develop a SWIN-RNA submodel. This submodel is then integrated with an RNA sequence model to construct a multimodal model for downstream analysis.

**Results:** We conducted experiments on various RNA downstream tasks. In the sequence classification task, the MCC reached 94.4%, surpassing the state-of-the-art RNAErnie model by 1.2%. In the protein–RNA interaction prediction, DRFormer achieved an MCC of 0.492, outperforming advanced models like BERT-RBP and PrismNet. In RNA secondary structure prediction, the F1 score was 0.690, exceeding the widely used SPOT-RNA model by 1%. Additionally, generalization experiments on DNA tasks yielded satisfactory results.

**Conclusions:** DRFormer is the first RNA sequence downstream analysis model that leverages structural features to construct a vision model and integrates sequence and vision models in a multimodal manner. This approach yields excellent prediction and analysis results, making it a valuable contribution to RNA research.
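The Methods describe turning an RNA sequence into image-like features based on secondary structure and sequence distance. The abstract does not give the exact construction, so the following is only an illustrative sketch under stated assumptions: the secondary structure is supplied as a dot-bracket string (in practice it would come from a structure predictor), one channel is a base-pair contact map, and the other is the normalized index distance |i - j| / (n - 1). The function names `pair_table` and `structure_distance_maps` are hypothetical, and this is not the authors' feature definition.

```python
def pair_table(dot_bracket):
    """Return partner[i] = j for paired positions and -1 for unpaired ones."""
    partner = [-1] * len(dot_bracket)
    stack = []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return partner


def structure_distance_maps(dot_bracket):
    """Build two n x n maps (assumed channels, not the paper's definition).

    contact[i][j]  = 1.0 if positions i and j are base-paired, else 0.0
    distance[i][j] = |i - j| scaled to the range [0, 1]
    """
    n = len(dot_bracket)
    partner = pair_table(dot_bracket)
    contact = [[1.0 if partner[i] == j else 0.0 for j in range(n)]
               for i in range(n)]
    scale = max(n - 1, 1)  # avoid division by zero for length-1 input
    distance = [[abs(i - j) / scale for j in range(n)] for i in range(n)]
    return contact, distance


# Hypothetical hairpin: positions 0-5 and 1-4 are paired.
contact, distance = structure_distance_maps("((..))")
for row in contact:
    print(row)
```

Maps like these can be stacked as channels and resized to a fixed resolution before being passed to an image backbone. How DRFormer actually encodes, pre-trains on, and fuses these features is described in the full paper.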

## Full-text entities

- **Genes:** F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, AUH (AU RNA binding methylglutaconyl-CoA hydratase) [NCBI Gene 549], CPD (carboxypeptidase D) [NCBI Gene 1362] {aka GP180}, CPSF3 (cleavage and polyadenylation specific factor 3) [NCBI Gene 51692] {aka CPSF-73, CPSF73, NEDMHS, NEDMHSN}, SND1 (staphylococcal nuclease and tudor domain containing 1) [NCBI Gene 27044] {aka TDRD11, TSN, Tudor-SN, p100}, CDC40 (cell division cycle 40) [NCBI Gene 51362] {aka EHB3, PCH15, PRP17, PRPF17}, RSS [NCBI Gene 140821], SUGP1 (SURP and G-patch domain containing 1) [NCBI Gene 57794] {aka F23858, RBP, SF4}, ATXN2 (ataxin 2) [NCBI Gene 6311] {aka ATX2, SCA2, TNRC13}
- **Diseases:** DIS (MESH:C535290), injury to (MESH:D014947)
- **Chemicals:** dUTP (MESH:C027078), DRFormer (-), TFP (MESH:D014268)
- **Species:** Homo sapiens (human, species) [taxon 9606], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Mus musculus (house mouse, species) [taxon 10090]
- **Cell lines:** HEK293: Homo sapiens (Human), Transformed cell line (CVCL_0045); HEK293T: Homo sapiens (Human), Transformed cell line (CVCL_0063); HepG2: Homo sapiens (Human), Hepatoblastoma, Cancer cell line (CVCL_0027); K562: Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11942477/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11942477/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC11942477/full.md

---
Source: https://tomesphere.com/paper/PMC11942477