# Integration of unpaired and heterogeneous clinical flow cytometry data

**Authors:** Mike Phuycharoen, Verena Kaestele, Thomas Williams, Lijing Lin, Tracy Hussell, John Grainger, Magnus Rattray

PMC · DOI: 10.1016/j.isci.2026.114937 · iScience · 2026-02-07

## TL;DR

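UVAE (Unbiasing Variational Autoencoder) is a semi-supervised method that integrates heterogeneous clinical flow cytometry data in a shared latent space. Applied to samples from COVID-19 patients, it reduces batch effects and improves prediction of peak disease severity.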

## Contribution

UVAE introduces a semi-supervised variational autoencoder framework for integrating unpaired biomedical data while correcting batch effects and rebalancing cell-type proportions.

## Key findings

- UVAE enables integration of heterogeneous clinical flow cytometry data from multiple panels.
- The model improves downstream prediction tasks by reducing batch effects and enhancing statistical signals of disease-related cell types.
- UVAE effectively handles changes in panel designs during studies and improves longitudinal disease severity predictions.

## Abstract

We introduce the Unbiasing Variational Autoencoder (UVAE), a computational framework for the integration of unpaired biomedical data streams such as clinical flow cytometry. UVAE addresses batch effect correction and data alignment by training a semi-supervised model on partially labeled datasets, enabling simultaneous normalization and integration of diverse data within a shared latent space. The framework implements a probabilistic model for batch effect normalization and balances class contents during training to ensure accurate representation of underlying cell composition. We apply UVAE to integrate heterogeneous clinical flow cytometry data from COVID-19 patients. The integrated data enhances the statistical signal of cell types associated with disease severity, enables clustering of subpopulations without the impediment of batch effects, and improves the performance of longitudinal regression for predicting peak disease severity from temporal patient samples.
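For orientation, the abstract's "shared latent space" with batch normalization can be read against the textbook conditional VAE objective. The expression below is that generic evidence lower bound, not the paper's exact loss, which is given in the full text. The symbols are assumptions introduced here: $x$ is a cell's marker measurements, $b$ its batch or panel, $z$ the shared latent code, and $\theta$, $\phi$ the decoder and encoder parameters.

$$
\mathcal{L}(x, b) = \mathbb{E}_{q_\phi(z \mid x, b)}\left[\log p_\theta(x \mid z, b)\right] - \mathrm{KL}\left(q_\phi(z \mid x, b) \,\|\, p(z)\right)
$$

Conditioning both encoder and decoder on $b$ lets batch-specific variation be explained by $b$ rather than by $z$, which is the usual intuition behind VAE-based batch correction.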

- UVAE integrates data from multiple flow cytometry panels collected from clinical samples
- UVAE deals effectively with changes to panel designs during study
- Data rebalancing during batch correction controls for differences in cell-type proportions
- Model-imputed data improves downstream prediction tasks
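The rebalancing highlight can be illustrated with a minimal sketch of class-balanced resampling. It assumes rebalancing means drawing an equal number of cells per labelled class. This summary does not describe the paper's actual procedure, so the function `balanced_resample`, its parameters, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict


def balanced_resample(labels, n_per_class, seed=0):
    """Return cell indices so each labelled class contributes n_per_class draws.

    Illustrative only: not the UVAE implementation.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        if label is not None:  # unlabelled cells are skipped in this sketch
            by_class[label].append(idx)
    sampled = []
    for label in sorted(by_class):
        # Sample with replacement so rare classes can be upsampled.
        sampled.extend(rng.choices(by_class[label], k=n_per_class))
    return sampled


# Hypothetical batch: neutrophils dominate, monocytes are rare, one cell is unlabelled.
labels = ["neutrophil"] * 8 + ["monocyte"] * 2 + [None]
idx = balanced_resample(labels, n_per_class=4)
counts = {c: sum(labels[i] == c for i in idx) for c in ("neutrophil", "monocyte")}
print(counts)  # {'neutrophil': 4, 'monocyte': 4}
```

Equalizing class counts within each batch means that a correction step comparing batches sees the same cell-type mix in each, so a difference in proportions is less likely to be mistaken for a batch effect.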

Biocomputational method; Bioinformatics

## Linked entities

- **Diseases:** COVID-19 (MONDO:0100096)

## Full-text entities

- **Genes:** FCGR3A (Fc gamma receptor IIIa) [NCBI Gene 2214] {aka CD16-II, CD16A, FCG3, FCGR3, FCRIIIA, FcGRIIIA}, CD14 (CD14 molecule) [NCBI Gene 929], MME (membrane metalloendopeptidase) [NCBI Gene 4311] {aka CALLA, CD10, CMT2T, NEP, SCA43, SFE}, CD86 (CD86 molecule) [NCBI Gene 942] {aka B7-2, B7.2, B70, BU63, CD28LG2, CD86 v6}, CCR2 (C-C motif chemokine receptor 2) [NCBI Gene 729230] {aka CC-CKR-2, CCR-2, CCR2A, CCR2B, CD192, CKR2}, CX3CR1 (C-X3-C motif chemokine receptor 1) [NCBI Gene 1524] {aka CCRL1, CMKBRL1, CMKDR1, GPR13, GPRV28, V28}, ITGA4 (integrin subunit alpha 4) [NCBI Gene 3676] {aka CD49D, IA4}, ITGAX (integrin subunit alpha X) [NCBI Gene 3687] {aka CD11C, SLEB6}, CD274 (CD274 molecule) [NCBI Gene 29126] {aka ADMIO5, B7-H, B7H1, PD-L1, PDCD1L1, PDCD1LG1}, SELL (selectin L) [NCBI Gene 6402] {aka CD62L, LAM1, LECAM1, LEU8, LNHR, LSEL}, ITGAM (integrin subunit alpha M) [NCBI Gene 3684] {aka CD11B, CR3A, HNA-4, MAC-1, MAC1A, MO1A}, FCGR1A (Fc gamma receptor Ia) [NCBI Gene 2209] {aka CD64, CD64A, FCG1, FCGR1, FCRI, FcgammaRI}, PTPRC (protein tyrosine phosphatase receptor type C) [NCBI Gene 5788] {aka B220, CD45, CD45R, GP180, IMD105, L-CA}
- **Diseases:** COVID (MESH:D000086382), neutrophilia (MESH:C563010), acute COVID-19 (MESH:D000094024), MMD (MESH:D009800), lymphopenia (MESH:D008231), Cancer (MESH:D009369), EMD (MESH:C535290), CLL (MESH:D015451), LISI (MESH:C537340)
- **Species:** Homo sapiens (human, species) [taxon 9606], Gammacoronavirus (genus) [taxon 694013]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12964232/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12964232/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/PMC12964232/full.md

---
Source: https://tomesphere.com/paper/PMC12964232