# scEVE: a single-cell RNA-seq ensemble clustering algorithm capitalizing on the differences of predictions between multiple clustering methods

**Authors:** Yanis Asloudj, Fleur Mougin, Patricia Thébault

PMC · DOI: 10.1093/nargab/lqaf073 · 2025-06-09

## TL;DR

The paper introduces scEVE, a single-cell RNA-seq ensemble clustering algorithm that describes the differences between the predictions of multiple clustering methods instead of minimizing them.

## Contribution

The contribution is an ensemble clustering approach that generates clustering results with uncertainty values and multiple resolutions, two challenges that earlier ensemble algorithms had not addressed.

## Key findings

- In an evaluation on 15 experimental datasets and up to 1200 synthetic datasets, scEVE outperforms the state of the art.
- The algorithm addresses two conceptual challenges in single-cell data science: reporting uncertainty values and producing clustering results at multiple resolutions.
- The authors highlight how downstream biological analyses benefit from addressing these challenges.

## Abstract

Single-cell RNA sequencing measures individual cell transcriptomes in a sample. In the past decade, this technology has motivated the development of hundreds of clustering methods. These methods attempt to group cells into populations by leveraging the similarity of their transcriptomes. Because each method relies on specific hypotheses, their predictions can vary drastically. To address this issue, ensemble algorithms detect cell populations by integrating multiple clustering methods, and minimizing the differences of their predictions. While this approach is sensible, it has yet to address some conceptual challenges in single-cell data science; namely, ensemble algorithms have yet to generate clustering results with uncertainty values and multiple resolutions. In this work, we present an original approach to ensemble clustering that addresses these challenges, by describing the differences between clustering results, rather than minimizing them. We present the scEVE algorithm, and we evaluate it on 15 experimental datasets, and up to 1200 synthetic datasets. Our results reveal that scEVE outperforms the state of the art, and addresses both conceptual challenges. We also highlight how biological downstream analyses will benefit from addressing these challenges. We expect that this work will provide an alternative direction for developing single-cell ensemble clustering algorithms.
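The idea of describing, rather than minimizing, the differences between clustering results can be illustrated with a small sketch. This is not the scEVE algorithm. It only measures how much several hypothetical methods agree on the same cells, using the adjusted Rand index as an assumed agreement measure (the summary does not state which measure scEVE uses). The method names and toy labels are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two cells are required")
    # Count cell pairs grouped together in both labelings, and in each one.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:
        # Degenerate case, e.g. both labelings put every cell in one cluster.
        return 1.0
    return (pairs_both - expected) / (maximum - expected)


def pairwise_agreement(predictions):
    """Agreement score for every pair of methods (keys sorted by name)."""
    return {
        (a, b): adjusted_rand_index(predictions[a], predictions[b])
        for a, b in combinations(sorted(predictions), 2)
    }


# Hypothetical predictions of three methods for six cells (illustrative only).
predictions = {
    "method_A": [0, 0, 0, 1, 1, 1],
    "method_B": [1, 1, 1, 0, 0, 0],  # same partition as method_A, labels swapped
    "method_C": [0, 0, 1, 1, 2, 2],  # a finer partition
}

agreement = pairwise_agreement(predictions)
for pair, score in agreement.items():
    print(pair, round(score, 3))
```

Here methods A and B agree perfectly despite using different label names, while method C splits the cells at a finer resolution and agrees only partially with both. A low pairwise score of this kind is the sort of disagreement that a conventional ensemble would try to reconcile.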

## Full-text entities

- **Genes:** IGF2 (insulin like growth factor 2) [NCBI Gene 3481] {aka C11orf43, GRDF, IGF-II, PP9974, SRS3}, EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, IGFBP2 (insulin like growth factor binding protein 2) [NCBI Gene 3485] {aka IBP2, IGF-BP53}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, H19 (H19 imprinted maternally expressed transcript) [NCBI Gene 283120] {aka ASM, ASM1, BWS, D11S813E, GMRSP, LINC00008}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}
- **Diseases:** non-small-cell lung cancer (MESH:D002289), ependymoma (MESH:D004806), metastasis (MESH:D009362), GBM (MESH:D005910), pancreatic ductal adenocarcinoma (MESH:D021441), melanoma (MESH:D008545), PDAC (MESH:C537768), MLM (OMIM:155600), cancer (MESH:D009369), inflammation (MESH:D007249), inflammatory response (MESH:D018746), glioblastoma (MESH:D005909), CRC (MESH:D015179), AML (MESH:D015470), hypoxia (MESH:D000860)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12147100/full.md

---
Source: https://tomesphere.com/paper/PMC12147100