# One-sample missing DNA-methylation value imputation

**Authors:** Christelle Kemda Ngueda, Julia Palm, Flavia Remo, André Scherag, Lutz Leistritz

PMC · DOI: 10.1186/s12859-025-06154-9 · BMC Bioinformatics · 2025-05-31

## TL;DR

This paper introduces a new method for imputing missing DNA methylation values using only a single sample, which is useful for personalized medicine when data is limited.

## Contribution

The novel contribution is OSMI (One-Sample Methyl Imputation), a DNA-methylation imputation method that works on a single sample and has low memory and computational requirements.

## Key findings

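- In single-subject cases, OSMI achieved an average imputation accuracy of RMSE = 0.2713 (95% CI: 0.2696 to 0.2730) in β-value units (range: 0–1), based on 450K BeadChip data from 3,402 individuals.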
- Imputation accuracy improves when considering CpG island affiliations and increases with CpG site density.
- OSMI is suitable for single-sample applications but less accurate than multi-sample methods when multiple similar samples are available.

## Abstract

Currently, the most popular methods for missing DNA-methylation value imputation rely on exploiting methylation patterns across multiple samples from the same population. However, if there is significant variability between individuals or limited data available, these methods might produce biased results. This situation has prompted researchers to seek alternative approaches for handling single-sample data, particularly in the context of personalized medicine. Accordingly, we propose One-Sample Methyl Imputation (OSMI), an imputation method that can also be used in single-sample applications.

In single-subject cases, the proposed method yielded an average imputation accuracy of RMSE = 0.2713 (95% CI: 0.2696 to 0.2730) in β-value units (range: 0–1), based on real 450K BeadChip data sets from 3,402 individuals. Taking the affiliation of individual CpGs to CpG islands into account during imputation improves the accuracy. In addition, imputation accuracy generally depends on the density of CpG sites on DNA-methylation microarrays and increases with that density. OSMI has low memory and computational requirements.
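To make the accuracy metric concrete, the following is a minimal sketch of how an RMSE in β-value units can be computed when known values are held out and then imputed. The function name `rmse` and the example numbers are hypothetical and are not taken from the paper.

```python
import math

def rmse(true_beta, imputed_beta):
    """Root-mean-square error between two equal-length sequences of beta-values."""
    if len(true_beta) != len(imputed_beta) or not true_beta:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_beta, imputed_beta)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical held-out beta-values (each in [0, 1]) and their imputed counterparts.
true_beta = [0.10, 0.90, 0.50, 0.80]
imputed_beta = [0.20, 0.70, 0.50, 0.60]

error = rmse(true_beta, imputed_beta)
print(f"RMSE = {error:.4f}")
```

Because β-values lie between 0 and 1, the RMSE is bounded by the same range, which is why the abstract reports the range alongside the value.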

OSMI uses a single methylome to impute missing values quickly and with very low memory requirements. Its imputation accuracy is inferior to that of other methods when multiple, reasonably similar samples are available, but OSMI is a useful addition to the imputation toolbox for single-sample applications.
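As a rough illustration of what imputing from a single methylome can look like, the sketch below fills each missing β-value with the value of the nearest observed CpG on the same chromosome. This is a generic nearest-neighbour stand-in chosen for brevity, not the OSMI algorithm, which is described in the full paper; the function name `impute_nearest` and the example data are hypothetical.

```python
from bisect import bisect_left

def impute_nearest(positions, betas):
    """Fill None entries with the beta-value of the nearest observed CpG.

    positions: sorted, unique genomic coordinates on one chromosome.
    betas: beta-values aligned with positions; None marks a missing value.
    NOTE: illustrative stand-in only, not the OSMI method.
    """
    observed = [(p, b) for p, b in zip(positions, betas) if b is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    obs_pos = [p for p, _ in observed]

    result = []
    for p, b in zip(positions, betas):
        if b is not None:
            result.append(b)
            continue
        # Locate the observed neighbours to the left and right of p.
        i = bisect_left(obs_pos, p)
        candidates = observed[max(i - 1, 0): i + 1]
        nearest = min(candidates, key=lambda pb: abs(pb[0] - p))
        result.append(nearest[1])
    return result

# Hypothetical single-sample data: two of five CpG sites are missing.
positions = [100, 150, 400, 420, 900]
betas = [0.9, None, 0.2, None, 0.1]
filled = impute_nearest(positions, betas)
```

In a scheme like this, the distance to the nearest observed neighbour shrinks as probes become denser, which gives some intuition for why accuracy might increase with CpG site density.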

The online version contains supplementary material available at 10.1186/s12859-025-06154-9.

## Full-text entities

- **Diseases:** tumour (MESH:D009369), genetic disorders (MESH:D030342), rare diseases (MESH:D035583)
- **Chemicals:** Cytosine-phosphate-guanine (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12126866/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12126866/full.md

## References

4 references — full list in the complete paper: https://tomesphere.com/paper/PMC12126866/full.md

---
Source: https://tomesphere.com/paper/PMC12126866