SUMO: an R package for simulating multi-omics data for methods development and testing
Bernard Isekah Osang’ir, Surya Gupta, Ziv Shkedy, Jürgen Claesen

TL;DR
SUMO is an R package that generates customizable multi-omics datasets to test and develop new computational methods.
Contribution
SUMO introduces a flexible framework for simulating multi-omics data with controllable latent structures and noise.
Findings
SUMO allows users to define distinct and shared latent factors in multi-omics datasets.
The package supports reproducible testing of methods through controlled signal structures.
SUMO is freely available on CRAN and GitHub for open use and development.
Abstract
Insights from integrative multi-omics analyses have fueled demand for innovative computational methods and tools in multi-omics research. However, the scarcity of multi-omics datasets with user-defined signal structures hinders the evaluation of these newly developed tools. SUMO (SimUlating Multi-Omics), an open-source R package, was developed to address this gap by generating high-quality, factor analysis-based datasets with full control over dataset structure, including latent factors, noise, and complexity. Users can configure datasets with distinct and/or shared non-overlapping latent factors, giving flexible and precise control over the signal structure. Consequently, SUMO allows reproducible testing and validation of methods, fostering methodological innovation. The SUMO R package is freely available on the Comprehensive R Archive Network…
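To make the idea of a factor analysis-based simulation concrete, here is a minimal sketch in plain Python. It is not SUMO's API or implementation: every function name, parameter, and default value below is hypothetical and chosen only for illustration. The sketch assumes each omic matrix is a sum of rank-one terms (factor scores times feature loadings) plus Gaussian noise, with one factor shared by both omics and one factor distinct to the first, placed on non-overlapping sample blocks.

```python
import random


def make_scores(n_samples, active_samples, rng, mean=2.0, sd=0.5):
    """Latent factor scores: non-zero only for samples that carry the signal."""
    active = set(active_samples)
    return [rng.gauss(mean, sd) if i in active else 0.0 for i in range(n_samples)]


def make_loadings(n_features, active_features, rng, mean=1.5, sd=0.3):
    """Feature loadings: non-zero only for features driven by the factor."""
    active = set(active_features)
    return [rng.gauss(mean, sd) if j in active else 0.0 for j in range(n_features)]


def simulate_omic(n_samples, n_features, factors, noise_sd, rng):
    """One omic matrix: sum over factors of outer(scores, loadings) plus noise."""
    data = [[rng.gauss(0.0, noise_sd) for _ in range(n_features)]
            for _ in range(n_samples)]
    for scores, loadings in factors:
        for i, s in enumerate(scores):
            if s == 0.0:
                continue  # this sample does not carry the factor
            for j, w in enumerate(loadings):
                data[i][j] += s * w
    return data


def simulate_two_omics(seed=1, n_samples=40, p1=30, p2=20, noise_sd=0.5):
    """Two omics with one shared factor and one factor distinct to omic 1.

    All sizes and block positions are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed makes the dataset reproducible
    # The two factors live on non-overlapping blocks of samples.
    shared_scores = make_scores(n_samples, range(0, 10), rng)
    distinct_scores = make_scores(n_samples, range(20, 30), rng)
    omic1 = simulate_omic(
        n_samples, p1,
        [(shared_scores, make_loadings(p1, range(0, 5), rng)),
         (distinct_scores, make_loadings(p1, range(10, 15), rng))],
        noise_sd, rng)
    omic2 = simulate_omic(
        n_samples, p2,
        [(shared_scores, make_loadings(p2, range(0, 5), rng))],
        noise_sd, rng)
    return omic1, omic2


if __name__ == "__main__":
    omic1, omic2 = simulate_two_omics(seed=42)
    print(len(omic1), len(omic1[0]), len(omic2), len(omic2[0]))
```

Because the signal blocks and the seed are fixed by the caller, a method under evaluation can be scored against a known ground truth, and rerunning with the same seed reproduces the same matrices. Setting `noise_sd=0.0` in this sketch leaves only the planted signal, which is convenient for sanity checks.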
Taxonomy
Topics: Bioinformatics and Genomic Networks
