# Prediction of gene expression using histone modification patterns extracted by Particle Swarm Optimization

**Authors:** Niels Benjamin Paul, Jonas Chanrithy Wolber, Malte Lennart Sahrhage, Tim Beißbarth, Martin Haubrock

PMC · DOI: 10.1093/bioinformatics/btaf033 · Bioinformatics · 2025-01-29

## TL;DR

This paper introduces PatternChrome, an algorithm that uses Particle Swarm Optimization to extract histone modification patterns from gene promoters and uses them to predict binary gene expression more accurately than previous methods.

## Contribution

The novel contribution is predicting gene expression from histone modification patterns in a gene's promoter, rather than from modification abundance alone, with the patterns extracted by Particle Swarm Optimization.
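
For readers unfamiliar with Particle Swarm Optimization, the following is a minimal sketch of a generic global-best PSO loop. It is not the PatternChrome implementation. The function name `pso_minimize`, the parameters `inertia`, `c1`, and `c2`, and the toy objective (squared distance to a made-up target vector standing in for a "pattern") are all assumptions chosen for illustration. The authors' actual objective and pattern encoding are described in the full paper and the linked repository.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iter=100,
                 bounds=(-1.0, 1.0), inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best PSO. Returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random starting positions, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Hypothetical target vector, used only to give the swarm something to find.
target = [0.2, -0.5, 0.7, 0.0]


def toy_objective(x):
    """Squared distance between a candidate vector and the toy target."""
    return sum((a - b) ** 2 for a, b in zip(x, target))


best, best_val = pso_minimize(toy_objective, dim=len(target))
print(best, best_val)
```

In a pattern-extraction setting, the objective would instead score how well a candidate pattern separates the two expression classes. That scoring function is specific to the paper and is not reproduced here.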

## Key findings

- PatternChrome achieved an average AUC of 0.9029 over 56 samples for binary classification of gene expression, outperforming all previous algorithms for the same task.
- Predictive histone modification patterns are generalizable across samples and largely independent of cellular specificity.
- Explaining the classifier's decisions shows how specific features, histone modifications, and promoter positions affect transcription regulation.
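
The AUC reported above is an average of per-sample scores. As a rough illustration of how such a number is computed, the sketch below calculates AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, then averages over samples. The function names `roc_auc` and `mean_auc` and the toy labels and scores are assumptions for illustration. They are not taken from the paper, and the pairwise loop favors clarity over speed.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc(per_sample):
    """Average AUC over an iterable of (labels, scores) pairs, one pair per sample."""
    aucs = [roc_auc(labels, scores) for labels, scores in per_sample]
    return sum(aucs) / len(aucs)


# Two made-up samples with four genes each (1 = positive class, 0 = negative class).
toy_samples = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]),
    ([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]),
]
print(mean_auc(toy_samples))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.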

## Abstract

Histone modifications play an important role in transcription regulation. Although the general importance of some histone modifications for transcription regulation has been previously established, the relevance of others and their interactions is the subject of ongoing research. By training machine learning models to predict a gene’s expression and explaining their decision-making process, we can gain insight into how histone modifications affect transcription. In previous studies, trained models were either hard to explain or trained solely on the abundance of histone modifications. Based on other studies, which used histone modification patterns rather than abundance to identify potential regulatory elements, we hypothesize that the histone modification pattern in a gene’s promoter is more predictive of gene expression. We used an optimization algorithm to extract predictive histone modification profiles.

Our algorithm, PatternChrome, achieved an average area under the curve (AUC) score of 0.9029 over 56 samples for binary classification, outperforming all previous algorithms for the same task. We explained the models’ decisions to deduce the effect of specific features, certain histone modifications, or promoter positions on transcription regulation. Although the predictive histone modification patterns were extracted for each sample separately, they can be used to predict gene expression in other samples, implying that the created patterns are largely generalizable. Interestingly, the impact of histone modifications on gene regulation appears predominantly indifferent to cellular specificity. Through explanation of the classifier’s decisions, we substantiate established literature knowledge while concurrently revealing novel insights into the intricate landscape of transcriptional regulation via histone modification.

The code for the PatternChrome algorithm, the analysis scripts, and the required data are available at https://gitlab.gwdg.de/MedBioinf/generegulation/patternchrome.

## Full-text entities

- **Genes:** POU5F1 (POU class 5 homeobox 1) [NCBI Gene 5460] {aka OCT3, OCT4, OCT4Borf1, OTF-3, OTF3, OTF4}, TIE1 (tyrosine kinase with immunoglobulin like and EGF like domains 1) [NCBI Gene 7075] {aka JTK14, LMPHM11, TIE}, his-72 (Histone H3.3 type 2) [NCBI Gene 176660]
- **Diseases:** arrhythmias (MESH:D001145), BE (MESH:D019960)
- **Chemicals:** BE (-), Glucose (MESH:D005947)
- **Species:** C. elegans [taxon 328850]
- **Cell lines:** E003 — Homo sapiens (Human), Melanoma, Cancer cell line (CVCL_B4KF), E065 — Homo sapiens (Human), Melanoma, Cancer cell line (CVCL_EI52), E123 — Mus musculus (Mouse), Factor-dependent cell line (CVCL_HE67)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11802466/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11802466/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC11802466/full.md

---
Source: https://tomesphere.com/paper/PMC11802466