# CellUntangler: Separating distinct biological signals in single-cell data with deep generative models

**Authors:** Sarah Chen, Aviv Regev, Anne Condon, Jiarui Ding

PMC · DOI: 10.1016/j.xgen.2025.101073 · 2025-12-01

## TL;DR

CellUntangler is a deep generative model that separates overlapping biological signals, such as the cell cycle and cell type, in single-cell RNA sequencing (scRNA-seq) data.

## Contribution

CellUntangler introduces a deep generative model with multiple latent subspaces to disentangle complex biological signals in single-cell data.

## Key findings

- CellUntangler successfully disentangles the cell cycle from other processes in both cycling-only and mixed datasets.
- The model generalizes to separate additional signals like spatial zonation and interferon response.
- It identifies marker genes associated with specific biological signals of interest.

## Abstract

Single-cell RNA sequencing has provided new insights into both intracellular and intercellular processes. However, multiple processes, such as cell-type programs, differentiation, and the cell cycle, often occur simultaneously within one cell. Existing methods typically target a single process and impose restrictive assumptions, risking the loss of valuable biological information. We introduce CellUntangler, a deep generative model that embeds cells into a latent space composed of multiple subspaces, each tailored with an appropriate geometry to capture a distinct signal. Applied to datasets of cycling-only and mixed cycling/non-cycling cells, CellUntangler disentangles the cell cycle from other processes such as cell type. The framework generalizes to disentangle additional signals, including spatial, tissue dissociation, interferon response, and cell-type identity. By providing flexible embeddings to capture various signals, CellUntangler enables selective enhancement or filtering of signals at the gene-expression level, offering a powerful tool for disentangling complex biological processes in single-cell data.

- CellUntangler disentangles multiple biological signals in scRNA-seq data
- Applicable to diverse signals such as the cell cycle and spatial zonation
- Captures cell-cycle dynamics in datasets with cycling-only and mixed cells
- Identifies marker genes associated with signals of interest

Chen et al. present CellUntangler, a deep-learning model that utilizes a flexible latent space composed of multiple subspaces, allowing for the capture and filtering of biological signals in single-cell RNA sequencing data. CellUntangler demonstrates broad extensibility across a wide range of signals.
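To make the idea of a latent space built from several subspaces more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes, purely for illustration, a two-dimensional block constrained to the unit circle for a periodic signal such as the cell cycle, plus a Euclidean block for everything else. The geometries CellUntangler actually uses are not stated in this summary. The names `CYCLE_DIM`, `embed`, `cycle_phase`, `filter_cycle`, and `decode` are hypothetical, and random matrices stand in for a trained encoder and decoder.

```python
import numpy as np

CYCLE_DIM = 2  # assumption: the first two latent dimensions hold the cell-cycle signal


def to_circle(z_cycle, eps=1e-8):
    """Project 2-D points onto the unit circle, a periodic geometry."""
    norm = np.linalg.norm(z_cycle, axis=1, keepdims=True)
    return z_cycle / np.maximum(norm, eps)


def embed(raw):
    """Build a product embedding: a circular block followed by a Euclidean block."""
    return np.concatenate([to_circle(raw[:, :CYCLE_DIM]), raw[:, CYCLE_DIM:]], axis=1)


def cycle_phase(z):
    """Angle in [0, 2*pi) read off the circular block, used as a pseudo-phase."""
    return np.mod(np.arctan2(z[:, 1], z[:, 0]), 2 * np.pi)


def filter_cycle(z):
    """Filter the cycle signal by giving every cell the same reference point on the circle."""
    out = z.copy()
    out[:, :CYCLE_DIM] = np.array([1.0, 0.0])
    return out


def decode(z, weights):
    """Toy linear decoder with a softmax output (stand-in for a trained network)."""
    logits = z @ weights
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    expr = np.exp(logits)
    return expr / expr.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
raw = rng.normal(size=(5, 6))       # stand-in for encoder outputs: 5 cells, 6 latent dims
weights = rng.normal(size=(6, 10))  # stand-in for decoder weights: 10 genes

z = embed(raw)
phase = cycle_phase(z)
expr_filtered = decode(filter_cycle(z), weights)
```

In this toy setup, `expr_filtered` depends only on the Euclidean block, so two cells that differ only in their cycle coordinates decode to the same expression profile. That mirrors, in a very simplified way, the idea of filtering one signal at the gene-expression level while keeping the others.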

## Full-text entities

- **Genes:** Ifnb1 (interferon beta 1, fibroblast) [NCBI Gene 15977] {aka IFN-beta, IFNB, If1da1, Ifb}, Cd14 (CD14 antigen) [NCBI Gene 12475], Ago2 (argonaute RISC catalytic subunit 2) [NCBI Gene 239528] {aka 1110029L17Rik, 2310051F07Rik, Eif2c2, Gerp95, Gm10365, mKIAA4215}, Gpnmb (glycoprotein (transmembrane) nmb) [NCBI Gene 93695] {aka DC-HIL, Dchil, ipd}, Cdc20 (cell division cycle 20) [NCBI Gene 107995] {aka 2310042N09Rik, p55CDC}, Cdk1 (cyclin dependent kinase 1) [NCBI Gene 12534] {aka Cdc2, Cdc2a, p34<CDC2>}, Clec9a (C-type lectin domain family 9, member a) [NCBI Gene 232414] {aka 9830005G06Rik, DNGR-1}, Ifit1 (interferon-induced protein with tetratricopeptide repeats 1) [NCBI Gene 15957] {aka GARG-16, IFI-56K, ISG56, Ifi56, P56}, Cyp4a14 (cytochrome P450, family 4, subfamily a, polypeptide 14) [NCBI Gene 13119], Ms6hm3 (minisatellite 6 hypermutable 3) [NCBI Gene 111469] {aka PC-2}, Ccne2 (cyclin E2) [NCBI Gene 12448], Pck1 (phosphoenolpyruvate carboxykinase 1, cytosolic) [NCBI Gene 18534] {aka PEPCK, PEPCK-C, Pck-1}, Cyp4a10 (cytochrome P450, family 4, subfamily a, polypeptide 10) [NCBI Gene 13117] {aka Cyp4a, D4Rp1, Msp-3, RP1}, Aldh1b1 (aldehyde dehydrogenase 1 family, member B1) [NCBI Gene 72535] {aka 2700007F14Rik}, Isg20 (interferon-stimulated protein) [NCBI Gene 57444] {aka 1600023I01Rik, 2010107M23Rik, 20kDa, DnaQL, HEM45}, Ms6hm (minisatellite 6 hypermutable) [NCBI Gene 17653] {aka PC-1}, Neurog3 (neurogenin 3) [NCBI Gene 11925] {aka Atoh5, Math4B, bHLHa7, ngn3}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, IFNA1 (interferon alpha 1) [NCBI Gene 3439] {aka IFL, IFN, IFN-ALPHA, IFN-alphaD, IFNA13, IFNA@}, ITGA2B (integrin subunit alpha 2b) [NCBI Gene 3674] {aka BDPLT16, BDPLT2, CD41, CD41B, FMAIT2, GP2B}, Fev (FEV transcription factor, ETS family member) [NCBI Gene 260298] {aka Pet-1, Pet1, Pex1, mPet-1}, Slc1a2 (solute carrier family 1 (glial high affinity glutamate transporter), member 2) [NCBI Gene 20511] {aka 
1700091C19Rik, 2900019G14Rik, Eaat2, GLT-1, GLT1, MGLT1}, IFNB1 (interferon beta 1) [NCBI Gene 3456] {aka IFB, IFF, IFN-beta, IFNB}, Rrm2 (ribonucleotide reductase M2) [NCBI Gene 20135] {aka R2}, Cyp2a5 (cytochrome P450, family 2, subfamily a, polypeptide 5) [NCBI Gene 13087] {aka CYPIIA5, Coh, Cyp15a2}, Cd209a (CD209a antigen) [NCBI Gene 170786] {aka CD209, CDSIGN, CIRE, DC-SIGN, DC-SIGN1, Dcsign}, Igf1 (insulin-like growth factor 1) [NCBI Gene 16000] {aka C730016P09Rik, Igf-1, Igf-I}, Cyp2f2 (cytochrome P450, family 2, subfamily f, polypeptide 2) [NCBI Gene 13107] {aka Cyp2f}, Trem2 (triggering receptor expressed on myeloid cells 2) [NCBI Gene 83433] {aka TREM-2, Trem2a, Trem2b, Trem2c}, Elovl3 (ELOVL fatty acid elongase 3) [NCBI Gene 12686] {aka CIN-2, Cig30}, Cyp2e1 (cytochrome P450, family 2, subfamily e, polypeptide 1) [NCBI Gene 13106] {aka CYPIIE1, Cyp2e}, Acsl1 (acyl-CoA synthetase long-chain family member 1) [NCBI Gene 14081] {aka Acas, Acas1, Acs, FACS, Facl2, LACS 1}, Xcr1 (chemokine (C motif) receptor 1) [NCBI Gene 23832] {aka Ccxcr1, Gpr5, mXcr1}, Clec10a (C-type lectin domain family 10, member A) [NCBI Gene 17312] {aka CD301a, M-ASGP-BP-1, Mgl, Mgl1}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}
- **Diseases:** EoE (MESH:D057765), inflammation (MESH:D007249), tumor (MESH:D009369), ovarian cancer (MESH:D010051), chronic (MESH:D002908), esophagus (MESH:D004938)
- **Chemicals:** Revelio (-), fatty acids (MESH:D005227), Chromium (MESH:D002857)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232), HeLa — Homo sapiens (Human), Human papillomavirus-related endocervical adenocarcinoma, Cancer cell line (CVCL_0030)

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12903416/full.md

---
Source: https://tomesphere.com/paper/PMC12903416