# diPaRIS: Dynamic and Interpretable Protein‐RNA Interactions Prediction With U‐Shaped Network and Novel Structure Encoding

**Authors:** Lishen Zhang, Chengqian Lu, Xiaoqing Peng, Fei Guo, Hongdong Li, Jianxin Wang

PMC · DOI: 10.1002/advs.202506314 · Advanced Science · 2025-08-29

## TL;DR

diPaRIS is a deep learning method that improves predictions of dynamic protein-RNA interactions using RNA structure data, offering better accuracy and interpretability.

## Contribution

diPaRIS introduces a novel encoding scheme for RNA structures and a U-shaped network to enhance prediction accuracy and interpretability of protein-RNA interactions.
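
To make the "U-shaped" idea concrete, the following is a minimal NumPy sketch of a generic 1D encoder-decoder with skip connections. It is an illustration only, not the diPaRIS architecture: the layer widths, kernel size, the 5-channel input, the per-position sigmoid output, and all names (`conv1d_same`, `tiny_u_forward`, `init`) are assumptions made for this example, and the weights are random rather than trained.

```python
import numpy as np


def conv1d_same(x, w, b):
    """Zero-padded 1D convolution followed by ReLU.

    x: (L, C_in), w: (K, C_in, C_out) with odd K, b: (C_out,).
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([
        np.tensordot(xp[i:i + k], w, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0])
    ])
    return np.maximum(out + b, 0.0)


def down(x):
    """Halve the length with max pooling over non-overlapping pairs."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1]).max(axis=1)


def up(x):
    """Double the length by repeating each position."""
    return np.repeat(x, 2, axis=0)


def init(rng, k, c_in, c_out):
    """Random (untrained) weights; hypothetical helper for this sketch."""
    return rng.normal(0.0, 0.1, size=(k, c_in, c_out)), np.zeros(c_out)


def tiny_u_forward(x, params):
    """Contracting path, bottleneck, expanding path with skip connections."""
    if x.shape[0] % 4 != 0:
        raise ValueError("this toy version needs a length divisible by 4")
    e1 = conv1d_same(x, *params["enc1"])            # (L, 8)
    e2 = conv1d_same(down(e1), *params["enc2"])     # (L/2, 16)
    bott = conv1d_same(down(e2), *params["bott"])   # (L/4, 32)
    # Skip connections: concatenate encoder features at the same resolution.
    d2 = conv1d_same(np.concatenate([up(bott), e2], axis=1), *params["dec2"])
    d1 = conv1d_same(np.concatenate([up(d2), e1], axis=1), *params["dec1"])
    w, b = params["head"]
    logits = d1 @ w + b                             # (L, 1)
    return (1.0 / (1.0 + np.exp(-logits)))[:, 0]    # per-position scores


rng = np.random.default_rng(0)
params = {
    "enc1": init(rng, 3, 5, 8),
    "enc2": init(rng, 3, 8, 16),
    "bott": init(rng, 3, 16, 32),
    "dec2": init(rng, 3, 32 + 16, 16),
    "dec1": init(rng, 3, 16 + 8, 8),
    "head": (rng.normal(0.0, 0.1, size=(8, 1)), np.zeros(1)),
}
x = rng.random((100, 5))  # assumed input: 100 nucleotides, 5 feature channels
scores = tiny_u_forward(x, params)
print(scores.shape)
```

The skip connections are what distinguish a U-shaped network from a plain encoder-decoder: features at full resolution are passed directly to the decoder, which is one reason such architectures lend themselves to position-level outputs.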

## Key findings

- diPaRIS outperforms existing methods in accuracy, AUC, AUPR, and F1-score across 44 datasets.
- The model excels in cross-cell line predictions and provides interpretable binding motifs and attribution maps.
- diPaRIS predictions help interpret gene-disease associations and RNA binding conservation.
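
For readers less familiar with the four metrics named above, here is a small pure-Python sketch using their standard definitions. It is not the authors' evaluation code: the 0.5 decision threshold, the use of average precision as the AUPR estimate, and the function names are assumptions for this example.

```python
def accuracy_and_f1(y_true, y_score, threshold=0.5):
    """Accuracy and F1-score after thresholding predicted scores."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """AUPR estimated as the mean precision at the rank of each positive.

    Tied scores are ordered arbitrarily in this simple version.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive")
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank
    return total / n_pos


# Toy labels and scores, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(accuracy_and_f1(y_true, y_score))
print(roc_auc(y_true, y_score))
print(average_precision(y_true, y_score))
```

AUC and AUPR are threshold-free, whereas accuracy and F1-score depend on the chosen cutoff, which is why all four are commonly reported together.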

## Abstract

Protein‐RNA interactions play a critical role in various biological processes and disease development. Proteins interact with RNAs through dynamic binding sites that exhibit specific structural patterns under various cellular conditions. While current computational methods take RNA structures in vivo into account, they fall short in capturing the structure contextual association of nucleotides, limiting predictive accuracy. Here, diPaRIS, a deep learning method, is proposed to predict dynamic protein‐RNA interactions with improved accuracy and enhanced interpretability by integrating RNA structures in vivo. diPaRIS introduces a novel encoding scheme for SHAPE‐seq to encode nucleotide correlations, providing a more comprehensive representation of RNA structures. Leveraging a U‐shaped network architecture, diPaRIS not only improves prediction performance but also enables interpretable analysis by learning sequence binding motifs and generating attribution maps. Benchmarking diPaRIS across 44 datasets shows its superiority over existing methods. The model consistently achieves the highest accuracy, AUC, AUPR, and F1‐score across all datasets. Additionally, diPaRIS excels in cross‐cell line predictions, consistently outperforming the second‐best method across all datasets. Predictions by diPaRIS reflect the conservation of protein‐RNA binding and facilitate further functional interpretation of genetic variants in complex diseases. The findings highlight that diPaRIS effectively predicts protein‐RNA interactions and interprets potential gene‐disease associations.

This study presents diPaRIS, a deep learning method designed to predict dynamic protein‐RNA interactions with enhanced accuracy. By integrating RNA structures in vivo through a novel encoding scheme for nucleotide correlations, diPaRIS improves the interpretability of binding motifs and outperforms current state‐of‐the‐art methods across multiple datasets. The results provide new insights into protein‐RNA binding preferences and their role in disease development.
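
This summary does not describe the diPaRIS encoding itself, so the sketch below shows only a common baseline for combining sequence with in vivo structure scores: a one-hot sequence matrix plus one per-nucleotide structure channel. It treats each nucleotide independently and therefore does not capture the nucleotide correlations that the paper's scheme is designed to encode. The function name `encode_rna`, the channel layout, and the `-1.0` placeholder for missing scores are assumptions for this example.

```python
import numpy as np

ALPHABET = "ACGU"


def encode_rna(seq, structure_scores):
    """Return an (L, 5) array: 4 one-hot sequence channels + 1 structure channel.

    structure_scores holds one value per nucleotide (for example a normalized
    reactivity); None marks a missing measurement.
    """
    if len(seq) != len(structure_scores):
        raise ValueError("sequence and structure scores must have equal length")
    x = np.zeros((len(seq), 5), dtype=np.float32)
    rna = seq.upper().replace("T", "U")
    for i, (nt, score) in enumerate(zip(rna, structure_scores)):
        if nt in ALPHABET:
            x[i, ALPHABET.index(nt)] = 1.0
        # Unknown bases such as N keep an all-zero sequence row.
        # Assumed convention: -1.0 flags a missing structure score.
        x[i, 4] = -1.0 if score is None else float(score)
    return x


features = encode_rna("ACGU", [0.1, None, 0.5, 1.0])
print(features.shape)
```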

## Full-text entities

- **Genes:** GHR (growth hormone receptor) [NCBI Gene 2690] {aka GHBP, GHIP}, DDX3X (DEAD-box helicase 3 X-linked) [NCBI Gene 1654] {aka CAP-Rf, DBX, DDX14, DDX3, HLP2, MRX102}, SF3B4 (splicing factor 3b subunit 4) [NCBI Gene 10262] {aka AFD1, Hsh49, SAP49, SF3b49}, UPF1 (UPF1 RNA helicase and ATPase) [NCBI Gene 5976] {aka HUPF1, NORF1, RENT1, UTF, pNORF1, smg-2}, RPS3 (ribosomal protein S3) [NCBI Gene 6188] {aka S3, uS3}, DNAH8 (dynein axonemal heavy chain 8) [NCBI Gene 1769] {aka ATPase, SPGF46, hdhc9}, FLT3 (fms related receptor tyrosine kinase 3) [NCBI Gene 2322] {aka CD135, FLK-2, FLK2, STK1}, BCLAF1 (BCL2 associated transcription factor 1) [NCBI Gene 9774] {aka BTF, bK211L9.1}, IGF2 (insulin like growth factor 2) [NCBI Gene 3481] {aka C11orf43, GRDF, IGF-II, PP9974, SRS3}, DDX24 (DEAD-box helicase 24) [NCBI Gene 57062], SMG7 (SMG7 nonsense mediated mRNA decay factor) [NCBI Gene 9887] {aka C1orf16, EST1C, SGA56M}, EDC3 (enhancer of mRNA decapping 3) [NCBI Gene 80153] {aka LSM16, MRT50, YJDC, YJEFN2, hYjeF_N2-15q23}, DDX6 (DEAD-box helicase 6) [NCBI Gene 1656] {aka HLR2, IDDILF, P54, RCK, Rck/p54}, STAT5A (signal transducer and activator of transcription 5A) [NCBI Gene 6776] {aka MGF, STAT5}, DCP1A (decapping mRNA 1A) [NCBI Gene 55802] {aka HSA275986, Nbla00360, SMAD4IP1, SMIF}, TGFB1 (transforming growth factor beta 1) [NCBI Gene 7040] {aka CAEND1, CED, DPD1, IBDIMDE, LAP, TGF-beta1}, PUM1 (pumilio RNA binding family member 1) [NCBI Gene 9698] {aka HSPUM, NEDMSF, PUMH, PUMH1, PUML1, SCA47}, SMG5 (SMG5 nonsense mediated mRNA decay factor) [NCBI Gene 23381] {aka EST1B, LPTS-RP1, LPTSRP1, SMG-5}, ZNF800 (zinc finger protein 800) [NCBI Gene 168850], AGO1 (argonaute RISC component 1) [NCBI Gene 26523] {aka EIF2C, EIF2C1, GERP95, NEDLBAS, Q99, hAgo1}, DDX19A (DEAD-box helicase 19A) [NCBI Gene 55308] {aka DDX19-DDX19L, DDX19L}, DCP2 (decapping mRNA 2) [NCBI Gene 167227] {aka NUDT20}, MFSD11 (major facilitator superfamily domain containing 11) [NCBI Gene 79157] {aka ET}, LARP4 (La ribonucleoprotein 4) [NCBI Gene 113251] {aka PP13296}, ADAR (adenosine deaminase RNA specific) [NCBI Gene 103] {aka ADAR1, AGS6, DRADA, DSH, DSRAD, G1P1}, IGF2BP3 (insulin like growth factor 2 mRNA binding protein 3) [NCBI Gene 10643] {aka CT98, IMP-3, IMP3, KOC, KOC1, VICKZ3}, IGF2BP1 (insulin like growth factor 2 mRNA binding protein 1) [NCBI Gene 10642] {aka CRD-BP, CRDBP, IMP-1, IMP1, VICKZ1, ZBP1}, RNPS1 (RNA binding protein with serine rich domain 1) [NCBI Gene 10921] {aka E5.1}, AKR1C3 (aldo-keto reductase family 1 member C3) [NCBI Gene 8644] {aka DD3, DDX, HA1753, HAKRB, HAKRe, HSD17B5}, IGF2BP2 (insulin like growth factor 2 mRNA binding protein 2) [NCBI Gene 10644] {aka IMP-2, IMP2, VICKZ2}
- **Diseases:** genetic diseases (MESH:D030342), cancer (MESH:D009369), hepatocellular carcinoma (MESH:D006528), leukemia (MESH:D007938), leukaemia (MESH:D015458), chronic myeloid leukemia (MESH:D015464), tumorigenesis (MESH:D063646)
- **Chemicals:** uracil (MESH:D014498), guanine (MESH:D006147), guanosine (MESH:D006151), cytosine (MESH:D003596), SNV (-), inosine (MESH:D007288), adenine (MESH:D000225), nucleotide (MESH:D009711)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** adenine is replaced by cytosine, cytosine is replaced by adenine
- **Cell lines:** K562 — Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004); PABPC4 — Homo sapiens (Human), Chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_TB84); S25 — Mus musculus (Mouse), Hybridoma (CVCL_G585); HEK293 — Homo sapiens (Human), Transformed cell line (CVCL_0045); HepG2 — Homo sapiens (Human), Hepatoblastoma, Cancer cell line (CVCL_0027); PCBP2 — Homo sapiens (Human), Colon carcinoma, Cancer cell line (CVCL_A628); PUM2 — Mus musculus (Mouse), Hybridoma (CVCL_KJ89)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12520504/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12520504/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/PMC12520504/full.md

---
Source: https://tomesphere.com/paper/PMC12520504