# A deep learning framework for lysine 2-hydroxyisobutyrylation site prediction using evolutionary feature representation

**Authors:** Heba M. Elreify, Fathi E. Abd El-Samie, Moawad I. Dessouky, Hanaa Torkey, Said E. El-Khamy, Wafaa A. Shalaby

PMC · DOI: 10.1038/s41598-025-15883-z · Scientific Reports · 2025-11-06

## TL;DR

This paper introduces BLOS-Khib, a deep learning tool that predicts lysine 2-hydroxyisobutyrylation (Khib) sites from BLOSUM62-based evolutionary features, reaching AUC values of 0.885 to 0.913 on independent test sets from six species.

## Contribution

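BLOS-Khib is a CNN-based deep learning framework that encodes peptides with the BLOSUM62 matrix for cross-species Khib site prediction; the authors report improved performance over traditional machine learning classifiers and alternative deep learning architectures.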

## Key findings

- BLOS-Khib achieved AUC values from 0.885 to 0.913 on independent test sets across six species, including 0.913 for human and 0.903 for Botrytis cinerea.
- The optimal peptide length for prediction was found to be 43 amino acids.
- The model showed high transferability between evolutionarily distant organisms, suggesting potential convergent evolution of Khib determinants.
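
The 43-residue peptide length in the findings above can be pictured as a window of 21 residues on each side of a candidate lysine. The following is a minimal sketch of that windowing step, not the authors' code: the function name `lysine_windows` and the padding symbol `X` for windows that run past a protein terminus are assumptions made for illustration.

```python
from typing import List, Tuple

WINDOW = 43  # peptide length reported as optimal in the paper
PAD = "X"    # assumption: placeholder for positions beyond a protein terminus


def lysine_windows(sequence: str, window: int = WINDOW, pad: str = PAD) -> List[Tuple[int, str]]:
    """Return (1-based position, peptide) for every lysine in `sequence`.

    Each peptide has exactly `window` residues with the lysine at the centre.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the lysine sits at the centre")
    flank = window // 2
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    peptides = []
    for i, residue in enumerate(seq):
        if residue == "K":
            # Residue i of `seq` sits at index i + flank in `padded`,
            # so the slice [i, i + window) is centred on it.
            peptides.append((i + 1, padded[i:i + window]))
    return peptides


if __name__ == "__main__":
    for position, peptide in lysine_windows("MKAAK"):
        print(position, peptide)
```

Every lysine yields one candidate peptide; labelling each as modified or unmodified would require experimentally annotated Khib sites, which this sketch does not include.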

## Abstract

Lysine 2-hydroxyisobutyrylation (Khib) has emerged as a crucial Post-Translational Modification (PTM) with significant roles in diverse biological processes ranging from gene expression to metabolic regulation. Despite its importance, computational approaches for accurately predicting Khib sites remain limited. This study introduces BLOS-Khib, a deep-learning framework that utilizes evolutionary information encoded in the BLOSUM62 matrix within a Convolutional Neural Network (CNN) architecture for cross-species Khib site prediction. Through systematic optimization, we found that a 43-amino acid peptide length captures the optimal sequence context for prediction across six taxonomically diverse organisms. Comprehensive comparative analyses demonstrated that BLOS-Khib performs competitively with existing methods, achieving notable Area Under the ROC Curve (AUC) values on independent test sets: human (0.913), wheat (0.892), Toxoplasma gondii (0.893), rice (0.887), Candida albicans (0.885), and Botrytis cinerea (0.903). Our framework showed improved performance compared to state-of-the-art approaches, including traditional machine learning classifiers and alternative deep learning architectures. Sequence signature analysis revealed both conserved lysine-rich regions preceding modification sites and species-specific amino acid preferences at positions immediately flanking the target residue. Notably, our cross-species applicability experiments identified high transferability between evolutionarily distant organisms, suggesting the potential convergent evolution of Khib determinants. BLOS-Khib demonstrates competitive performance for PTM prediction, while providing evolutionary insights into the sequence determinants governing this emerging regulatory mechanism across diverse species.
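
To make the idea of BLOSUM62-based encoding concrete, the sketch below maps each residue of a peptide to its row of substitution scores, giving a numeric matrix that a CNN could take as input. It is an illustration under stated assumptions, not the authors' implementation: the function name `encode_peptide` is hypothetical, only a three-residue excerpt of BLOSUM62 (A, R, K) is embedded to keep the example short, and encoding padding or unknown residues as all-zero rows is an assumption. With all 20 standard residues as columns, a 43-residue peptide would become a 43 × 20 array under this scheme.

```python
from typing import Dict, List

# Excerpt of BLOSUM62 restricted to alanine (A), arginine (R) and lysine (K).
# A full encoder would use all 20 standard amino acids; the complete matrix
# can be loaded, for example, with Biopython's
# Bio.Align.substitution_matrices.load("BLOSUM62").
BLOSUM62_EXCERPT: Dict[str, Dict[str, int]] = {
    "A": {"A": 4, "R": -1, "K": -1},
    "R": {"A": -1, "R": 5, "K": 2},
    "K": {"A": -1, "R": 2, "K": 5},
}
ALPHABET = "ARK"  # column order of the encoded rows


def encode_peptide(
    peptide: str,
    matrix: Dict[str, Dict[str, int]] = BLOSUM62_EXCERPT,
    alphabet: str = ALPHABET,
) -> List[List[int]]:
    """Encode a peptide as one row of substitution scores per residue.

    Residues absent from `matrix` (padding, unknowns) become all-zero rows;
    this convention is an assumption of the sketch.
    """
    encoded = []
    for residue in peptide.upper():
        row = matrix.get(residue)
        if row is None:
            encoded.append([0] * len(alphabet))
        else:
            encoded.append([row[column] for column in alphabet])
    return encoded


if __name__ == "__main__":
    for row in encode_peptide("XKA"):
        print(row)
```

Each row reflects how readily the residue at that position is substituted by every other residue in the alphabet, which is the evolutionary information the abstract refers to.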

The online version contains supplementary material available at 10.1038/s41598-025-15883-z.

## Linked entities

- **Species:** Homo sapiens (taxon 9606), Candida albicans (taxon 5476), Botrytis cinerea (taxon 40559)

## Full-text entities

- **Chemicals:** Lysine (MESH:D008239), Khib (-)
- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530], Homo sapiens (human, species) [taxon 9606], Candida albicans (species) [taxon 5476], Toxoplasma gondii (species) [taxon 5811], Botrytis cinerea (gray fruit mold, species) [taxon 40559]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12592433/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12592433/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/PMC12592433/full.md

---
Source: https://tomesphere.com/paper/PMC12592433