# Precise prediction of hotspot residues in protein–RNA complexes using graph attention networks and pretrained protein language models

**Authors:** Siyuan Shen, Jie Chen, Zhijian Huang, Yuanpeng Zhang, Ziyu Fan, Yuting Kong, Lei Deng

PMC · DOI: 10.1093/bioinformatics/btaf197 · 2025-07-15

## TL;DR

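This paper introduces DeepHotResi, a sequence-based method that combines a pretrained protein language model, a Squeeze-and-Excitation (SE) module, and a graph attention network to predict hotspot residues in protein–RNA complexes without requiring high-resolution structural data.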

## Contribution

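The contribution is DeepHotResi, a sequence-based computational method that uses a pretrained protein language model to derive an amino acid contact map, refines residue-level features with a Squeeze-and-Excitation module, and applies a graph attention network to treat hotspot residue prediction in protein–RNA complexes as graph node classification.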

## Key findings

- DeepHotResi outperforms state-of-the-art methods at predicting hotspot residues in protein–RNA complexes, according to the authors' experiments.
- It reports superior accuracy on the test set while taking only sequence as input, without requiring high-resolution structural data.
- Integration of the Squeeze-and-Excitation module improves feature representation for hotspot prediction.
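
As a rough illustration of the Squeeze-and-Excitation idea mentioned in the last finding, the sketch below reweights feature channels of a residue-by-channel matrix. It follows the generic SE recipe (average each channel, pass the averages through a small bottleneck, gate each channel with a sigmoid). The function name `se_reweight`, the weight matrices `w1` and `w2`, and the toy values are assumptions for illustration; the summary does not say how DeepHotResi configures its SE module.

```python
import math


def se_reweight(features, w1, w2):
    """Generic Squeeze-and-Excitation gating (illustrative, not the paper's code).

    features: list of L residues, each a list of C channel values.
    w1: bottleneck weights, shape (C_hidden x C)  -- hypothetical, untrained.
    w2: expansion weights, shape (C x C_hidden)   -- hypothetical, untrained.
    Returns (scaled_features, gates).
    """
    n_res = len(features)
    n_ch = len(features[0])

    # Squeeze: average each channel over all residues.
    squeezed = [sum(row[c] for row in features) / n_res for c in range(n_ch)]

    # Excitation, step 1: bottleneck linear layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(w_row, squeezed)))
        for w_row in w1
    ]

    # Excitation, step 2: expand back to C channels and squash to (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(w_row, hidden))))
        for w_row in w2
    ]

    # Scale: multiply every residue's channel value by that channel's gate.
    scaled = [[v * g for v, g in zip(row, gates)] for row in features]
    return scaled, gates


# Toy input: 2 residues, 4 feature channels, bottleneck of size 2.
toy_features = [[1.0, 0.0, 2.0, 4.0], [3.0, 2.0, 0.0, 0.0]]
toy_w1 = [[0.5, -0.5, 0.25, 0.0], [0.1, 0.2, 0.3, 0.4]]
toy_w2 = [[1.0, -1.0], [0.0, 0.5], [-2.0, 0.0], [0.3, 0.3]]
toy_scaled, toy_gates = se_reweight(toy_features, toy_w1, toy_w2)
```

Each gate lies strictly between 0 and 1, so the module can only attenuate channels relative to one another; in a trained network the weights would be learned jointly with the downstream classifier.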

## Abstract

Protein–RNA interactions play a pivotal role in biological processes and disease mechanisms, with hotspot residues being critical for targeted drug design. Traditional experimental methods for identifying hotspot residues are often inefficient and expensive. Moreover, many existing prediction methods rely heavily on high-resolution structural data, which may not always be available. Consequently, there is an urgent need for an accurate and efficient sequence-based computational approach for predicting hotspot residues in protein–RNA complexes.

In this study, we introduce DeepHotResi, a sequence-based computational method designed to predict hotspot residues in protein–RNA complexes. DeepHotResi leverages a pretrained protein language model to predict protein structure and generate an amino acid contact map. To enhance feature representation, DeepHotResi integrates the Squeeze-and-Excitation (SE) module, which processes diverse amino acid-level features. Next, it constructs an amino acid feature network from the contact map and SE-module-derived features. Finally, DeepHotResi employs a graph attention network to model hotspot residue prediction as a graph node classification task. Experimental results demonstrate that DeepHotResi outperforms state-of-the-art methods, effectively identifying hotspot residues in protein–RNA complexes with superior accuracy on the test set.
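
The last two steps of the pipeline described above (building a residue graph from the contact map, then classifying nodes with graph attention) can be sketched in plain Python. This is a minimal single-head attention layer in the standard graph attention formulation, not the authors' implementation. The names `contact_neighbors`, `gat_layer`, and `hotspot_probability`, the 0.5 contact threshold, and all weights are assumptions made for illustration.

```python
import math


def contact_neighbors(contact_prob, threshold=0.5):
    """Turn an L x L contact map into neighbor lists (self-loops included).

    The 0.5 cutoff is an assumed value, not one taken from the paper.
    """
    n = len(contact_prob)
    return [
        [j for j in range(n) if j == i or contact_prob[i][j] >= threshold]
        for i in range(n)
    ]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(h, neighbors, W, a_src, a_dst):
    """One single-head graph attention layer (output nonlinearity omitted).

    h: L x F node features; W: F_out x F projection (hypothetical weights);
    a_src, a_dst: the two halves of the attention vector, each of length F_out.
    Returns (new_node_features, attention), where attention[i] maps each
    neighbor j of node i to its coefficient alpha_ij.
    """
    # Project every node: z_i = W h_i.
    z = [[sum(w * x for w, x in zip(w_row, hi)) for w_row in W] for hi in h]
    s_src = [sum(a * v for a, v in zip(a_src, zi)) for zi in z]
    s_dst = [sum(a * v for a, v in zip(a_dst, zi)) for zi in z]

    out, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # e_ij = LeakyReLU(a_src . z_i + a_dst . z_j), defined only on edges.
        scores = [leaky_relu(s_src[i] + s_dst[j]) for j in nbrs]
        # Numerically stable softmax over the neighbors of node i.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Aggregate neighbor projections weighted by attention.
        out.append([
            sum(a * z[j][k] for a, j in zip(alpha, nbrs))
            for k in range(len(W))
        ])
        attention.append(dict(zip(nbrs, alpha)))
    return out, attention


def hotspot_probability(node_embedding, w_out, b_out=0.0):
    """Logistic readout turning a node embedding into a hotspot score."""
    logit = sum(w * v for w, v in zip(w_out, node_embedding)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Toy example: 3 residues with 2 features each.
toy_contacts = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.6], [0.1, 0.6, 1.0]]
toy_h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_W = [[1.0, 0.0], [0.0, 1.0]]
toy_neighbors = contact_neighbors(toy_contacts)
toy_out, toy_attention = gat_layer(toy_h, toy_neighbors, toy_W, [0.3, -0.1], [0.2, 0.4])
toy_scores = [hotspot_probability(e, [0.5, -0.5]) for e in toy_out]
```

Because attention is computed only over contact-map neighbors, each residue's prediction depends on residues that are predicted to be spatially close rather than merely adjacent in sequence, which is the motivation for using a predicted contact map when no solved structure is available.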

The source code and dataset are available at https://github.com/Q1DT/DeepHotResi.

## Full-text entities

- **Genes:** SQLE (squalene epoxidase) [NCBI Gene 6713]
- **Diseases:** neurological disorders (MESH:D009461), cancer (MESH:D009369)
- **Chemicals:** alanine (MESH:D000409), carbon (MESH:D002244)

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12261489/full.md

---
Source: https://tomesphere.com/paper/PMC12261489