# HD-6mAPred: a hybrid deep learning approach for accurate prediction of N6-methyladenine sites in plant species

**Authors:** Huimin Li, Wei Gao, Yi Tang, Xiaotian Guo

PMC · DOI: 10.7717/peerj.19463 · PeerJ · 2025-05-15

## TL;DR

This paper introduces HD-6mAPred, a deep learning model that accurately predicts N6-methyladenine sites in plants, outperforming existing methods across three species.

## Contribution

HD-6mAPred combines BiGRU, CNN, and attention mechanisms with multiple DNA encoding strategies to improve 6mA site prediction accuracy and generalization.

## Key findings

- HD-6mAPred achieved 0.996 accuracy and 0.993 MCC in predicting 6mA sites in Rosaceae.
- In rice and Arabidopsis, the model reached accuracies of 0.952 and 0.937 and specificities of 0.949 and 0.948, respectively.
- HD-6mAPred outperforms existing methods and generalizes well across different plant species.

## Abstract

N6-methyladenine (6mA) is an important DNA methylation modification that serves a crucial function in various biological activities. Accurate prediction of 6mA sites is essential for elucidating its biological function and underlying mechanism. Although existing methods have achieved great success, there remains a pressing need for improved prediction accuracy and generalization capability across diverse species. This study aimed to develop a robust method to address these challenges.

We proposed HD-6mAPred, a hybrid deep learning model that combines a bidirectional gated recurrent unit (BiGRU), a convolutional neural network (CNN), and an attention mechanism, along with several DNA sequence encoding schemes. First, DNA sequences were encoded in four ways: one-hot encoding, electron-ion interaction pseudo-potential (EIIP), enhanced nucleic acid composition (ENAC), and nucleotide chemical properties (NCP). Second, a hold-out search strategy was employed to identify the optimal features or feature combinations for both the BiGRU and the CNN. Finally, the attention mechanism was introduced to weigh the importance of the features derived from the BiGRU and the CNN.
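As a rough illustration of the four encodings named above, the sketch below implements their commonly used definitions in NumPy. It is not taken from the authors' released code: the function names, the ENAC window size of 5, the `ACGT` column order, and the choice to concatenate the per-position encodings are all assumptions made for this example. The EIIP values and the NCP triplets (ring structure, functional group, hydrogen bonding) are the values usually quoted in the literature.

```python
import numpy as np

# Per-nucleotide lookup tables (standard literature values; assumed to match the paper).
ONE_HOT = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0], "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
# NCP triplet: (purine ring, amino group, weak hydrogen bonding)
NCP = {"A": [1, 1, 1], "C": [0, 1, 0], "G": [1, 0, 0], "T": [0, 0, 1]}


def one_hot(seq: str) -> np.ndarray:
    """Return an (L, 4) binary matrix, one row per nucleotide."""
    return np.array([ONE_HOT[b] for b in seq], dtype=float)


def eiip(seq: str) -> np.ndarray:
    """Return an (L, 1) matrix of electron-ion interaction pseudo-potentials."""
    return np.array([[EIIP[b]] for b in seq], dtype=float)


def ncp(seq: str) -> np.ndarray:
    """Return an (L, 3) matrix of nucleotide chemical properties."""
    return np.array([NCP[b] for b in seq], dtype=float)


def enac(seq: str, window: int = 5) -> np.ndarray:
    """Return an (L - window + 1, 4) matrix of A/C/G/T frequencies per sliding window.

    The window size is a hypothetical default chosen for this sketch.
    """
    rows = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        rows.append([chunk.count(b) / window for b in "ACGT"])
    return np.array(rows, dtype=float)


def encode_per_position(seq: str) -> np.ndarray:
    """Concatenate the three per-position encodings into an (L, 8) matrix."""
    return np.hstack([one_hot(seq), eiip(seq), ncp(seq)])


if __name__ == "__main__":
    example = "GATTACAGATC"  # hypothetical sequence, not from the datasets
    print(encode_per_position(example).shape)
    print(enac(example).shape)
```

ENAC yields fewer rows than the per-position encodings because each row summarizes a window, so it cannot be stacked column-wise with the others without padding or a separate input branch. How HD-6mAPred routes each encoding to the BiGRU and the CNN is decided by the hold-out search described above and is not reproduced here.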

A series of experiments on the Rosaceae, rice and Arabidopsis datasets were conducted to demonstrate the superiority of HD-6mAPred. In Rosaceae, the HD-6mAPred model achieved excellent performance: accuracy (ACC) of 0.996, Matthews correlation coefficient (MCC) of 0.993, sensitivity (SN) and specificity (SP) of 0.995 and 0.998, respectively. In rice, the evaluation metrics were 0.952 (ACC), 0.905 (MCC), 0.955 (SN), and 0.949 (SP). In Arabidopsis, the corresponding metrics were 0.937 (ACC), 0.875 (MCC), 0.927 (SN), and 0.948 (SP). Compared to existing methods, these results demonstrate that HD-6mAPred achieves state-of-the-art performance in predicting 6mA sites across three plant species. Furthermore, HD-6mAPred not only improves the accuracy of 6mA site prediction, but also shows excellent generalization capability across species. The source code utilized in this study is publicly accessible at https://doi.org/10.5281/zenodo.15355131.
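For readers unfamiliar with the four reported metrics, the sketch below computes ACC, MCC, SN, and SP from the counts of a binary confusion matrix using their standard definitions. The function name and the counts in the usage example are hypothetical and are not drawn from the paper's datasets.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute ACC, SN, SP, and MCC from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total          # fraction of all predictions that are correct
    sn = tp / (tp + fn)              # sensitivity: recall on true 6mA sites
    sp = tn / (tn + fp)              # specificity: recall on non-6mA sites
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in classification_metrics(tp=90, tn=80, fp=20, fn=10).items():
        print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside ACC: it stays informative when the positive and negative classes are imbalanced.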

## Linked entities

- **Species:** Rosaceae (taxon 3745), Arabidopsis (taxon 3701)

## Full-text entities

- **Chemicals:** 6mA (-), N6-methyladenine (MESH:C005955), nucleotide (MESH:D009711)
- **Species:** Arabidopsis thaliana (mouse-ear cress, species) [taxon 3702], Oryza sativa (Asian cultivated rice, species) [taxon 4530]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12085883/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12085883/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12085883/full.md

---
Source: https://tomesphere.com/paper/PMC12085883