# UniMap: Type‐Level Integration Enhances Biological Preservation and Interpretability in Single‐Cell Annotation

**Authors:** Haitao Hu, Yue Guo, Fujing Ge, Hao Yin, Hao Zhang, Zhesheng Zhou, Fangjie Yan, Qing Ye, Jialu Wu, Ji Cao, Chang‐Yu Hsieh, Bo Yang

PMC · DOI: 10.1002/advs.202410790 · 2025-02-27

## TL;DR

UniMap improves single-cell dataset integration by preserving biological variability and enhancing interpretability through a novel adversarial network.

## Contribution

UniMap introduces a multiselective adversarial network for type-level integration in single-cell data.

## Key findings

- UniMap outperforms existing methods in preserving biological variability during dataset integration.
- It enhances interpretability by identifying shared and domain-specific cell types.
- UniMap successfully creates high-resolution cell atlases and supports cross-species analysis.

## Abstract

Integrating single‐cell datasets from multiple studies provides a cost‐effective way to build comprehensive cell atlases, granting deeper insights into cellular characteristics across diverse biological systems. However, current data integration methods struggle with interference in partially overlapping datasets and varying annotation granularities. Here, UniMap is presented, introducing for the first time a multiselective adversarial network that functions as a “discerner” to identify and exclude interfering cells from various data sources during dataset integration. Compared to other state‐of‐the‐art methods, UniMap emphasizes type‐level integration and proves to be the best model for preserving biological variability, achieving noticeably higher accuracy in single‐cell automated annotation under various circumstances. Additionally, it enhances interpretability by revealing shared and domain‐specific cell types and providing prediction confidence. The efficacy of UniMap is demonstrated in terms of identifying new cell types, creating high‐resolution cell atlases, annotating cells along developmental trajectories, and performing cross‐species analysis, underscoring its potential as a robust tool for single‐cell research.

UniMap introduces a novel multiselective adversarial network that functions as a discerner in single‐cell dataset integration. It enables type‐level integration, ensuring accurate cell type annotation across diverse biological contexts while preserving shared and domain‐specific features. UniMap outperforms existing methods by maintaining biological variability and enhancing interpretability, making it a powerful tool for single‐cell research.
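The abstract does not spell out how the "discerner" weights cells, so the following is only an illustrative sketch of one common way a selective adversarial objective can be built for partially overlapping datasets: one domain discriminator per cell type, with each cell's contribution to discriminator k weighted by its predicted probability of belonging to type k. This is not UniMap's published implementation. All function names (`selective_domain_loss`, `class_weights`, `confidence`) and the toy logits are assumptions made for illustration; the paper's actual architecture and losses may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_cross_entropy(p, label, eps=1e-12):
    """Cross-entropy between a predicted domain probability and a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def selective_domain_loss(class_probs, disc_probs, domain_labels):
    """Mean per-cell domain loss (hypothetical helper).

    class_probs[i][k]: predicted probability that cell i is of type k.
    disc_probs[i][k]:  output of the type-k discriminator for cell i.
    domain_labels[i]:  1 for reference cells, 0 for query cells.
    Each type-specific discriminator loss is weighted by the type probability,
    so a cell mainly influences the discriminator of its likely type.
    """
    total = 0.0
    for probs, d_out, d in zip(class_probs, disc_probs, domain_labels):
        total += sum(w * binary_cross_entropy(p, d) for w, p in zip(probs, d_out))
    return total / len(domain_labels)


def class_weights(query_class_probs):
    """Average predicted probability per type over query cells.

    Types that are absent from the query data receive weights near zero,
    which is one way to flag them as domain-specific rather than shared.
    """
    n = len(query_class_probs)
    k = len(query_class_probs[0])
    return [sum(p[j] for p in query_class_probs) / n for j in range(k)]


def confidence(probs):
    """Prediction confidence taken as the maximum type probability."""
    return max(probs)


# Toy example: three reference types, but the query cells only look like
# the first two, so the third type should receive a very small weight.
query_probs = [softmax([4.0, 0.0, -4.0]), softmax([0.0, 4.0, -4.0])]
print("per-type weights:", class_weights(query_probs))
print("confidence of first cell:", confidence(query_probs[0]))
```

In a sketch like this, the per-type weights give a simple readout of which reference types appear shared with the query data, and the per-cell confidence can be thresholded to flag cells that may belong to a type missing from the reference. Both thresholds would need to be chosen empirically.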

## Full-text entities

- **Genes:** CLEC9A (C-type lectin domain containing 9A) [NCBI Gene 283420] {aka CD370, DNGR-1, DNGR1, UNQ9341}, IL1R2 (interleukin 1 receptor type 2) [NCBI Gene 7850] {aka CD121b, CDw121b, IL-1R-2, IL-1RT-2, IL-1RT2, IL1R2c}, CD8B (CD8 subunit beta) [NCBI Gene 926] {aka CD8B1, CD8beta, LEU2, LY3, LYT3, Ly-3}, RGS2 (regulator of G protein signaling 2) [NCBI Gene 5997] {aka G0S8}, SPON2 (spondin 2) [NCBI Gene 10417] {aka DIL-1, DIL1, M-SPONDIN, MINDIN}, FGFBP2 (fibroblast growth factor binding protein 2) [NCBI Gene 83888] {aka HBP17RP, KSP37}, S100A8 (S100 calcium binding protein A8) [NCBI Gene 6279] {aka 60B8AG, CAGA, CFAG, CGLA, CP-10, L1Ag}, NR5A1 (nuclear receptor subfamily 5 group A member 1) [NCBI Gene 2516] {aka AD4BP, ELP, FTZ1, FTZF1, POF7, SF-1}, C1QB (complement C1q B chain) [NCBI Gene 713] {aka C1QD2}, MAF (MAF bZIP transcription factor) [NCBI Gene 4094] {aka AYGRP, CCA4, CTRCT21, c-MAF}, CCR4 (C-C motif chemokine receptor 4) [NCBI Gene 1233] {aka CC-CKR-4, CD194, CKR4, CMKBR4, ChemR13, HGCN:14099}, TYMS (thymidylate synthetase) [NCBI Gene 7298] {aka DKCD, HST422, TMS, TS}, TYROBP (transmembrane immune signaling adaptor TYROBP) [NCBI Gene 7305] {aka DAP12, KARAP, PLOSL, PLOSL1}, GZMK (granzyme K) [NCBI Gene 3003] {aka GrK, TRYP2}, XCL1 (X-C motif chemokine ligand 1) [NCBI Gene 6375] {aka ATAC, LPTN, LTN, SCM-1, SCM-1a, SCM1}, CDKN1C (cyclin dependent kinase inhibitor 1C) [NCBI Gene 1028] {aka BWCR, BWS, KIP2, WBS, p57, p57Kip2}, IL7R (interleukin 7 receptor) [NCBI Gene 3575] {aka CD127, CDW127, IL-7R-alpha, IL-7Ralpha, IL7RA, IL7Ralpha}, FCGR3A (Fc gamma receptor IIIa) [NCBI Gene 2214] {aka CD16-II, CD16A, FCG3, FCGR3, FCRIIIA, FcGRIIIA}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, FOS (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 2353] {aka AP-1, C-FOS, p55}, GNLY (granulysin) [NCBI Gene 10578] {aka D2S69E, LAG-2, LAG2, NKG5, TLA519}, FLT3 (fms related receptor tyrosine kinase 3) [NCBI Gene 2322] {aka CD135, FLK-2, FLK2, STK1}, TNFRSF4 (TNF receptor superfamily member 4) [NCBI Gene 7293] {aka ACT35, CD134, IMD16, OX40, TXGP1L}, KLRC1 (killer cell lectin like receptor C1) [NCBI Gene 3821] {aka CD159A, NKG2, NKG2A}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, XCL2 (X-C motif chemokine ligand 2) [NCBI Gene 6846] {aka SCM-1b, SCM1B, SCYC2}, TNFAIP3 (TNF alpha induced protein 3) [NCBI Gene 7128] {aka A20, AIFBL1, AISBL, OTUD7C, TNFA1P2}, FCN1 (ficolin 1) [NCBI Gene 2219] {aka FCNM}, FOXP3 (forkhead box P3) [NCBI Gene 50943] {aka AIID, DIETER, IPEX, JM2, PIDX, XPID}, CEBPE (CCAAT enhancer binding protein epsilon) [NCBI Gene 1053] {aka C/EBP-epsilon, CRP1, IMD108, SGD1, c/EBP epsilon}, NCAM1 (neural cell adhesion molecule 1) [NCBI Gene 4684] {aka CD56, MSK39, NCAM}, C1QC (complement C1q C chain) [NCBI Gene 714] {aka C1Q-C, C1QD3, C1QG}, IFITM1 (interferon induced transmembrane protein 1) [NCBI Gene 8519] {aka 9-27, CD225, DSPA2a, IFI17, LEU13}, FCER1G (Fc epsilon receptor Ig) [NCBI Gene 2207] {aka FCRG}, HMGB2 (high mobility group box 2) [NCBI Gene 3148] {aka HMG2}, MKI67 (marker of proliferation Ki-67) [NCBI Gene 4288] {aka KIA, MIB-, MIB-1, PPP1R105}, ZBTB16 (zinc finger and BTB domain containing 16) [NCBI Gene 7704] {aka PLZF, ZNF145}, C1QA (complement C1q A chain) [NCBI Gene 712] {aka C1QD1}, DDIT4 (DNA damage inducible transcript 4) [NCBI Gene 54541] {aka Dig2, REDD-1, REDD1}, CD14 (CD14 molecule) [NCBI Gene 929], S100A9 (S100 calcium binding protein A9) [NCBI Gene 6280] {aka 60B8AG, CAGB, CFAG, CGLB, L1AG, LIAG}, IKZF2 (IKAROS family zinc finger 2) [NCBI Gene 22807] {aka ANF1A2, HELIOS, ICHAD, IMDIA, ZNF1A2, ZNFN1A2}, GZMB (granzyme B) [NCBI Gene 3002] {aka C11, CCPI, CGL-1, CGL1, CSP-B, CSPB}, CMTM5 (CKLF like MARVEL transmembrane domain containing 5) [NCBI Gene 116173] {aka CKLFSF5}, IL32 (interleukin 32) [NCBI Gene 9235] {aka IL-32alpha, IL-32beta, IL-32delta, IL-32gamma, NK4, TAIF}, HDC (histidine decarboxylase) [NCBI Gene 3067]
- **Diseases:** emphysema (MESH:D004646), MG (MESH:D009157), CVID (MESH:D017074), glaucoma (MESH:D005901), MK (MESH:D007947), tumor (MESH:D009369), ASW (MESH:C000721350), COVID-19 (MESH:D000086382)
- **Species:** Macaca mulatta (rhesus macaque, species) [taxon 9544], Homo sapiens (human, species) [taxon 9606], Macaca fascicularis (crab eating macaque, species) [taxon 9541], Mus musculus (house mouse, species) [taxon 10090]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12021081/full.md

---
Source: https://tomesphere.com/paper/PMC12021081