# SUMO: an R package for simulating multi-omics data for methods development and testing

**Authors:** Bernard Isekah Osang’ir, Surya Gupta, Ziv Shkedy, Jürgen Claesen

PMC · DOI: 10.1093/bioadv/vbaf264 · 2025-10-22

## TL;DR

SUMO is an R package that generates customizable multi-omics datasets to test and develop new computational methods.

## Contribution

SUMO introduces a flexible framework for simulating multi-omics data with controllable latent structures and noise.

## Key findings

- SUMO allows users to define distinct and shared latent factors in multi-omics datasets.
- The package supports reproducible testing of methods through controlled signal structures.
- SUMO is freely available on CRAN and GitHub for open use and development.

## Abstract

Insights from integrative multi-omics analyses have fueled demand for innovative computational methods and tools in multi-omics research. However, the scarcity of multi-omics datasets with user-defined signal structures hinders the evaluation of these newly developed tools. SUMO (SimUlating Multi-Omics), an open-source R package, was developed to address this gap by enabling the generation of high-quality factor analysis-based datasets with full control over dataset structure, including latent structures, noise, and complexity. Users can configure datasets with distinct and/or shared non-overlapping latent factors, which gives flexible and precise control over the signal structures. Consequently, SUMO allows reproducible testing and validation of methods, fostering methodological innovation.
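The idea of a factor analysis-based simulation with shared and distinct, non-overlapping latent factors can be illustrated with a minimal sketch. This is not SUMO's API or the authors' implementation: it is written in plain Python rather than R, and the function names `simulate_two_omics` and `feature_variance`, the feature block positions, and the loading range are all hypothetical choices made for illustration. It assumes a linear factor model in which each data matrix is latent scores times sparse loadings plus Gaussian noise.

```python
import random


def simulate_two_omics(n_samples=50, n_features=(40, 30), noise_sd=0.5, seed=1):
    """Simulate two omics layers from a linear factor model, X = Z W^T + E."""
    rng = random.Random(seed)  # fixed seed makes the dataset reproducible

    # Latent factor scores: one value per sample for each factor.
    scores = {
        "shared": [rng.gauss(0.0, 1.0) for _ in range(n_samples)],
        "distinct": [rng.gauss(0.0, 1.0) for _ in range(n_samples)],
    }

    # Hypothetical layout: each factor loads on its own, non-overlapping
    # block of features. The distinct factor is present in omic 1 only.
    blocks = [
        {"shared": range(0, 10), "distinct": range(20, 30)},  # omic 1
        {"shared": range(0, 10)},  # omic 2
    ]

    layers = []
    for p, layer_blocks in zip(n_features, blocks):
        # Sparse loadings: zero everywhere outside the factor's block.
        loadings = {name: [0.0] * p for name in layer_blocks}
        for name, block in layer_blocks.items():
            for j in block:
                loadings[name][j] = rng.uniform(1.0, 2.0)

        # Each entry is the summed factor signal plus independent noise.
        data = [
            [
                sum(scores[name][i] * loadings[name][j] for name in layer_blocks)
                + rng.gauss(0.0, noise_sd)
                for j in range(p)
            ]
            for i in range(n_samples)
        ]
        layers.append(data)

    return {"omic1": layers[0], "omic2": layers[1], "blocks": blocks}


def feature_variance(data, j):
    """Population variance of feature (column) j across samples."""
    column = [row[j] for row in data]
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


sim = simulate_two_omics()
print(len(sim["omic1"]), len(sim["omic1"][0]))  # samples, features in omic 1
print(len(sim["omic2"]), len(sim["omic2"][0]))  # samples, features in omic 2
```

Because the ground-truth blocks are returned alongside the data, a method under test can be scored on whether it recovers the features that carry signal, and the fixed seed makes that evaluation repeatable.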

The SUMO R package is freely available on the Comprehensive R Archive Network (https://doi.org/10.32614/CRAN.package.SUMO) and on GitHub (https://github.com/lucp12891/SUMO.git) under the CC-BY 4.0 license.

## Full-text entities

- **Diseases:** CLL (MESH:D015451)

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12630132/full.md

---
Source: https://tomesphere.com/paper/PMC12630132