AutoFlow: an interactive Shiny app for supervised and unsupervised flow cytometry analysis
Freya E R Woods, Emilyanne Leonard, Timothy Ebbels, Jonathan Cairns, Rhiannon David

TL;DR
AutoFlow is an R Shiny app that automates flow cytometry analysis using machine learning, making the analysis easier and more accurate for scientists.
Contribution
AutoFlow introduces an accessible, open-source tool for supervised and unsupervised flow cytometry analysis with robust performance on rare cell populations.
Findings
AutoFlow achieved 97.2% accuracy in multiclass classification of bone marrow cells.
For rare cell populations, AutoFlow demonstrated high sensitivity and specificity, up to 87.9% and 99.9% respectively.
The unsupervised workflow identified biologically meaningful cell clusters and candidate populations.
Abstract
Flow cytometry (FC) is a widely used technique for analysing cells or particles based on the fluorescence of specific markers. Thresholds for fluorescence are typically set manually, a laborious, subjective process that scales poorly as FC technology advances. Machine learning (ML) methods can address these issues but often require technical expertise many bench scientists do not possess. Thus, accessible, open-source, and cross-domain ML-based FC tools are needed. We present AutoFlow, an easy-to-use, adaptable R Shiny application for automated flow cytometry (FC) analysis. AutoFlow supports two workflows: supervised and unsupervised learning. The application automates key preprocessing steps including fluorescence compensation, debris exclusion, single-cell identification, viability marker gating, and downstream classification or clustering. Across three datasets, two publicly…
Funding: AstraZeneca (funder ID 10.13039/100004325).
1 Introduction
Flow cytometry (FC) is a technology for analysing single cells or particles and is widely used in cell biology research for applications including immunology, oncology, and drug discovery and development (Chattopadhyay and Roederer 2010, Woo et al. 2014, Bonilla et al. 2020, Ullas and Sinclair 2024). FC works by first isolating single cells using fluidics, then exposing them to laser light of a specific wavelength and detecting the scattered light and the fluorescence emitted after absorption. By staining cells with a panel of specific antibodies conjugated to fluorescent markers, FC can identify cell types when combined with downstream ‘gating’ analysis (Adan et al. 2017, McKinnon 2018). Manual gating is the process by which populations of cells expressing or not expressing given markers are delineated on a 2D plot. FC is an attractive assay in many laboratories because it can analyse thousands of particles in real time inexpensively and adapts to many different applications. However, as FC technology advances and staining panels include more biomarkers for finer-grained cell type identification, manual gating becomes more cumbersome. In contrast, automated methods for cellular identification (gating) scale well and offer many other potential benefits, such as improved accuracy and speed, lower bias, better comparability between labs, more explicit quantification of uncertainty, novel cell type discovery, and better potential for integration with downstream ML tasks.
Many ML-based methods for FC have been developed, such as FlowSOM (Flow self-organizing maps), flowMeans, SWIFT, flowClust, and others (Lo et al. 2008, Finak et al. 2009, Lo et al. 2009, Pyne et al. 2009, Aghaeepour et al. 2011, Van Gassen et al. 2015). These broadly rely on semi-supervised ML approaches: a typical workflow consists of dimensionality reduction and clustering to identify similar groups of cells. Biological domain expertise is subsequently required to link these clusters back to the biological outcome (Pedersen and Olsen 2020). These methods have made great progress; however, they usually require some pre-processing of the data and, because they demand hands-on coding expertise, are inaccessible to most wet lab scientists. Hauchamps et al. (2024) recently published an automatic pre-processing pipeline for FC data, but it does not extend to downstream analysis.
We present here an approach to use expert knowledge to train supervised machine learning algorithms for different specific-purpose classifications. In addition, we include an unsupervised workflow to aid with exploratory analysis and enable a data-driven approach to identifying cells from FC. This approach uses dimensionality reduction, clustering, and differential expression analysis (Pyne et al. 2009, Aghaeepour et al. 2011, Pedersen and Olsen 2020, Hauchamps et al. 2024).
Our methodology is presented in an easy-to-use and easy-to-adapt R Shiny application available at github.com/ferwoods/AutoFlow. AutoFlow supports flow cytometry standard (.FCS) files for end-to-end processing, enabling biologists and bioinformaticians alike to analyse FC data with ease, saving time and improving reproducibility.
2 Methods
2.1 Benchmarking datasets
We demonstrate both AutoFlow workflows (supervised and unsupervised) on three datasets.
The first is a novel dataset generated in-house using the bone marrow microphysiological system (BM-MPS). Briefly, BM-MPS chips are cultured with haematopoietic stem cells (HSCs) and mesenchymal stem cells in a ceramic scaffold mimicking the bone marrow (BM). This enables the study of a full cycle of BM injury and recovery in response to therapeutics, recapitulating clinical BM toxicities (Chou et al. 2020). Table S1, available as supplementary data at Bioinformatics online, summarizes the cluster of differentiation (CD) markers used to identify each cell type. Further information on FC data acquisition can be found in the Supplementary Material for BM-MPS.
The second and third datasets are publicly available: the Mosmann Rare (Mosmann et al. 2014) and Nilsson Rare (Rundberg Nilsson et al. 2013) datasets, previously published and collated by Weber and Soneson (2019). Mosmann Rare contains peripheral blood cells, with labels provided only for a rare cell type: cytokine-producing influenza-specific T cells. Nilsson Rare originates from haematopoietic cells, with labels provided for HSCs.
The approach was trained using cross-validation and tested on a held-out validation set; full details and results are described in the Supplementary Material.
2.2 Optional preprocessing
Before either ML workflow, an optional preprocessing pipeline can be applied. This includes margin removal, automated debris detection, QC [margin events and PeacoQC algorithm (Emmaneel et al. 2022)], and viability identification. Debris is removed using a two-component Gaussian mixture model [GMM, via Mclust (Scrucca et al. 2023)] fitted to forward-scatter measurements (FSC-A/H; using a radial summary when both are present). At this stage, raw cytometer values are used (no compensation or transformation), mirroring manual debris screening. The data are then compensated (using the spillover matrix when available) and transformed with a logicle transform. For viable-cell identification, users select a ‘viability’ channel and a two-component GMM proposes a viability threshold, which can be adjusted interactively. Each preprocessing step is fault-tolerant (a failed step is skipped rather than aborting the run); if all preprocessing fails, the analysis proceeds on raw data. At export, an additional doublet check (FSC-A versus FSC-H) is applied to write singlet flags into the processed FCS files.
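The debris and viability steps both rely on a two-component GMM proposing a threshold. The sketch below illustrates that idea in plain Python on synthetic one-dimensional data. It is not AutoFlow's code: the app uses Mclust in R, whereas this is a hand-written EM fit with unequal variances, and the function names, the synthetic forward-scatter values, and the rule of cutting where both components are equally likely are all assumptions made for illustration.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def fit_two_component_gmm(values, n_iter=200, tol=1e-8):
    """Fit a 1D two-component Gaussian mixture by EM.

    Returns (weights, means, sds), ordered so that means[0] <= means[1].
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two events")
    half = n // 2
    # Initialise each component from one half of the sorted data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    spread = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n) or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each event belongs to component 0.
        resp0, ll = [], 0.0
        for x in xs:
            a = math.log(weights[0]) + log_normal_pdf(x, means[0], sds[0])
            b = math.log(weights[1]) + log_normal_pdf(x, means[1], sds[1])
            top = max(a, b)
            log_total = top + math.log(math.exp(a - top) + math.exp(b - top))
            resp0.append(math.exp(a - log_total))
            ll += log_total
        # M-step: update weights, means and standard deviations.
        n0 = max(sum(resp0), 1e-12)
        n1 = max(n - n0, 1e-12)
        means = [
            sum(r * x for r, x in zip(resp0, xs)) / n0,
            sum((1 - r) * x for r, x in zip(resp0, xs)) / n1,
        ]
        var0 = sum(r * (x - means[0]) ** 2 for r, x in zip(resp0, xs)) / n0
        var1 = sum((1 - r) * (x - means[1]) ** 2 for r, x in zip(resp0, xs)) / n1
        sds = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
        weights = [n0 / n, n1 / n]
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    order = sorted(range(2), key=lambda k: means[k])
    return ([weights[k] for k in order], [means[k] for k in order], [sds[k] for k in order])


def propose_threshold(weights, means, sds):
    """Value between the two means where both components are equally likely."""
    def log_odds(x):
        return (math.log(weights[0]) + log_normal_pdf(x, means[0], sds[0])
                - math.log(weights[1]) - log_normal_pdf(x, means[1], sds[1]))

    lo, hi = means[0], means[1]
    if not (log_odds(lo) > 0 > log_odds(hi)):
        return (lo + hi) / 2  # no clean crossing; fall back to the midpoint
    for _ in range(60):  # bisection on the sign change of the log-odds
        mid = (lo + hi) / 2
        if log_odds(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Synthetic example: a low-scatter "debris" mode and a higher-scatter "cell" mode.
random.seed(0)
fsc = [random.gauss(20, 5) for _ in range(300)] + [random.gauss(100, 10) for _ in range(700)]
weights, means, sds = fit_two_component_gmm(fsc)
threshold = propose_threshold(weights, means, sds)
kept = [x for x in fsc if x > threshold]
print(f"threshold={threshold:.1f}, kept {len(kept)} of {len(fsc)} events")
```

In an interactive setting, a proposed threshold of this kind would serve only as a starting point that the user can move, which matches the adjustable viability threshold described above.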
2.3 Supervised machine learning
The supervised arm of AutoFlow requires the user to upload a pre-trained model, i.e. one trained on manually gated data. In this paper, all results were generated using a random forest (RF) classifier. RF is an ensemble machine-learning method that combines predictions from many decision trees (Breiman 2001), excelling in capturing complex non-linear relationships, handling high-dimensional feature spaces, and mitigating issues such as overfitting.
The app itself is agnostic to model type, provided the model is compatible with R’s predict() function; this includes any caret-based model. Models are uploaded in a ‘bundle’ that contains the fitted model and scaling information. When new data are uploaded, events are z-score normalized using the mean and standard deviation of the training dataset. At runtime, the app automatically maps features in the uploaded dataset to the expected model features; users can review or adjust these mappings within the interface. If features are missing or mismatched, the app flags them and blocks prediction until alignment is complete.
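The scaling and feature-alignment logic described above can be sketched as follows. This is an illustrative Python outline, not the app's R implementation: the bundle layout (`features`, `mean`, `sd` keys), the name-matching rule (case- and punctuation-insensitive), and all function names are assumptions.

```python
import re


def _normalise(name):
    """Lower-case a channel name and drop punctuation, e.g. 'CD-34' -> 'cd34'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_features(expected, uploaded, overrides=None):
    """Map each model feature to an uploaded column; return (mapping, missing)."""
    overrides = overrides or {}
    lookup = {_normalise(col): col for col in uploaded}
    mapping, missing = {}, []
    for feature in expected:
        if overrides.get(feature) in uploaded:
            mapping[feature] = overrides[feature]  # user-adjusted mapping wins
        elif _normalise(feature) in lookup:
            mapping[feature] = lookup[_normalise(feature)]
        else:
            missing.append(feature)
    return mapping, missing


def prepare_for_prediction(events, bundle, overrides=None):
    """Align columns and z-score events with the training mean and sd.

    `events` is a list of dicts (column name -> value). Raises ValueError
    while any model feature is unmapped, so no predictions can be generated.
    """
    uploaded = list(events[0]) if events else []
    mapping, missing = map_features(bundle["features"], uploaded, overrides)
    if missing:
        raise ValueError(f"unmapped model features: {missing}")
    return [
        [(event[mapping[f]] - bundle["mean"][f]) / bundle["sd"][f] for f in bundle["features"]]
        for event in events
    ]


# Hypothetical bundle and upload, for illustration only.
bundle = {
    "features": ["CD34", "CD71"],
    "mean": {"CD34": 1.0, "CD71": 2.0},
    "sd": {"CD34": 0.5, "CD71": 2.0},
}
events = [{"cd-34": 2.0, "CD71": 6.0}]
print(prepare_for_prediction(events, bundle))  # [[2.0, 2.0]]
```

Scaling with the training statistics, rather than statistics recomputed on the new upload, keeps new events on the same scale the model saw during training.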
For benchmarking, the Nilsson and Mosmann datasets were split into 70% training and 30% testing sets. For the BM-MPS dataset, the model was trained on Day 14 (selected for maximal lineage diversity) and tested on all other time points. Hyperparameters (mtry and ntree) were tuned via grid search.
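As a rough illustration of the benchmarking setup, the sketch below performs a reproducible 70/30 split and enumerates a hyperparameter grid. The candidate values for mtry and ntree are hypothetical, as are the function names; the source does not state which values were searched or whether the split was stratified.

```python
import itertools
import random


def split_events(events, train_fraction=0.7, seed=0):
    """Shuffle events reproducibly and split them into training and testing sets."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def parameter_grid(grid):
    """Expand a dict of candidate values into every hyperparameter combination."""
    names = sorted(grid)
    return [dict(zip(names, combo)) for combo in itertools.product(*(grid[n] for n in names))]


events = list(range(1000))  # stand-in for event indices
train, test = split_events(events)
# Hypothetical candidate values; each combination would be scored by cross-validation.
candidates = parameter_grid({"mtry": [2, 4, 6], "ntree": [100, 500]})
print(len(train), len(test), len(candidates))  # 700 300 6
```

For rare populations, a plain random split can leave very few positive events in the test set, so a split stratified by label may be preferable in practice.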
2.4 Unsupervised machine learning
The unsupervised version of the application relies on dimensionality reduction using principal component analysis (PCA) and uniform manifold approximation and projection (UMAP, with default parameters), clustering with Leiden by default (falling back to Louvain if Leiden is unavailable), and differential expression analysis (DEA) to identify defining cluster characteristics (Jolliffe 1986, Blondel et al. 2008, McInnes et al. 2018). In this application we utilize the Seurat workflow, which was developed for use with RNA-sequencing datasets (Satija et al. 2015, Stuart et al. 2019). Users can optionally alter the clustering resolution and the fold change threshold for DEA.
The unsupervised side of Fig. 1 shows a graphical representation of the workflow used in this section of the application.
Figure 1. Workflow used by the AutoFlow application. Data from raw FCS files are supported for analysis and output.
3 Results
3.1 Supervised machine learning
Performance was high across datasets (Table S2, available as supplementary data at Bioinformatics online): 99.99% accuracy for Mosmann Rare, 99.8% for Nilsson Rare, and 97.2% for BM-MPS under a leave-one-timepoint-out scheme.
In BM-MPS, performance across eight major cell types (Table S3, available as supplementary data at Bioinformatics online) showed high sensitivity and specificity, with slightly lower PPV in rare transitional populations due to class imbalance. AutoFlow reduced analysis time from ∼3 hours per sample to <5 minutes for the entire study. Supervised analysis is rapid, taking only seconds to classify new datasets.
3.2 Unsupervised machine learning
For the three benchmarking datasets, we used unsupervised clustering to group cells based on similarities. Subsequently, we performed differential expression analysis on these clustered cells to identify and characterize distinct cell populations within the datasets. Each cluster is assigned a tag, e.g. CD235A+CD71+CD34-, and hence domain knowledge is required to identify the expected cell population.
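One plausible way to derive such a tag from DEA output is sketched below. The rule is an assumption, not the app's documented behaviour: a marker is labelled "+" when its log2 fold change in the cluster versus all other cells meets a threshold, "-" when it falls below the negative threshold, and is omitted otherwise. The function name, the pseudocount, and the example values are hypothetical, and the sketch assumes non-negative expression summaries.

```python
import math


def cluster_tag(cluster_means, rest_means, log2fc_threshold=0.25, pseudocount=1.0):
    """Build a tag such as 'CD235A+CD71+CD34-' from per-marker mean expression."""
    parts = []
    for marker, inside in cluster_means.items():
        outside = rest_means[marker]
        log2fc = math.log2((inside + pseudocount) / (outside + pseudocount))
        if log2fc >= log2fc_threshold:
            parts.append(marker + "+")
        elif log2fc <= -log2fc_threshold:
            parts.append(marker + "-")
        # Markers with a small fold change are left out of the tag.
    return "".join(parts)


# Hypothetical mean expression inside one cluster and in all remaining cells.
cluster = {"CD235A": 7.0, "CD71": 3.0, "CD34": 0.0}
rest = {"CD235A": 1.0, "CD71": 1.0, "CD34": 3.0}
print(cluster_tag(cluster, rest))  # CD235A+CD71+CD34-
```

Raising the fold change threshold shortens the tag to the most strongly differing markers, which is one reason the threshold is worth exposing to the user.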
In the Mosmann Rare dataset, 109 rare T cells were identified by manual gating, with no other cell types labelled. Using AutoFlow, all manually gated rare cells were recovered within a single cluster, alongside an additional set of cells with highly similar marker profiles (Figs S1 and S2, available as supplementary data at Bioinformatics online), suggesting that AutoFlow may be detecting true rare events that were missed during manual gating.
For the Nilsson dataset, AutoFlow achieved high sensitivity (92.7%) and strong negative predictive value (99.9%), indicating nearly all manually gated rare cells were recovered. However, the positive predictive value was very low (3.9%), reflecting the significant overlap in marker expression between the rare and majority populations (Figs S3 and S4, available as supplementary data at Bioinformatics online). This overlap limited the ability of the unsupervised method to isolate the rare cell population accurately, highlighting a key limitation of the dataset rather than the algorithm.
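The combination of high sensitivity, high negative predictive value, and very low positive predictive value follows directly from the confusion-matrix definitions when the target population is rare. The sketch below computes the four metrics from counts; the counts are invented for illustration and are not the Nilsson results.

```python
def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # rare cells recovered
        "specificity": ratio(tn, tn + fp),  # other cells correctly excluded
        "ppv": ratio(tp, tp + fp),          # purity of the rare-cell cluster
        "npv": ratio(tn, tn + fn),          # purity of everything else
    }


# Made-up example: 100 rare cells among 100,000 events; the cluster containing
# the rare cells also absorbs 2,300 similar-looking cells from the majority.
metrics = binary_metrics(tp=93, fp=2300, tn=97600, fn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the false positives are drawn from a majority population roughly a thousand times larger than the rare one, even a small fraction of misassigned majority cells swamps the true positives and drives the PPV down while leaving sensitivity and NPV high.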
In the BM-MPS dataset, unsupervised classification aligned closely with manual gating for the bulk lineages, with erythroid, myeloid, and megakaryocytes all showing excellent sensitivity, specificity, and balanced accuracy. This indicates that broad grouping captures the main essence of the haematopoietic lineages. However, transitional states, particularly HSCs/Progenitors and the distinction between early and late granulocytes, were less consistent, underscoring the challenge of rigid categorical boundaries. Together, these results (Tables S6 and S7, available as supplementary data at Bioinformatics online, Figs S5–S7, available as supplementary data at Bioinformatics online) highlight both the strength of unsupervised approaches for identifying major cell types and the need to model haematopoiesis as a continuum when analysing transitional populations.
4 Discussion
The AutoFlow application uses supervised and unsupervised ML to enable cell identification. This app provides a bespoke, fast-running interface to make FC data analysis more accessible. In this study we automatically gated cells of known origin using manually gated data as a training set. In the case of the unsupervised version, cells are identified based on expression of surface markers and compared with manually gated study results.
The supervised version of the application provides a structured and guided approach to data analysis, with accuracy above 90% on all three benchmarking datasets. Researchers can leverage their expertise to define the gating criteria, offering flexibility and the ability to incorporate domain knowledge into the process. This method is particularly useful when dealing with well-understood, standard datasets for which specific populations of interest need to be identified. Hence, the supervised approach streamlines the process, reduces subjectivity, and ensures reproducibility that otherwise remains a challenge in manual gating.
Conversely, the unsupervised gating version caters to the increasingly prevalent challenge of managing complex, high-dimensional data. With this method, the application automatically identifies clusters and patterns within the data, revealing potentially novel and unexpected cell populations. By being data-driven, this method is ideal for exploratory analysis and can unveil hidden insights within heterogeneous datasets. The unsupervised approach enhances the ability to discover rare or unanticipated cell populations, offering a more comprehensive understanding of the biological system under investigation.
The combined use of both supervised and unsupervised versions within the application fosters a comprehensive data analysis pipeline. Researchers can use either workflow alone or combine them, e.g. by using supervised gating at an early stage to refine known populations and subsequently applying the unsupervised approach to uncover novel and more subtle cell types, or outliers. Used together, these workflows ensure that both well-established and hidden insights are fully explored, ultimately enhancing the depth and quality of the analysis.
Nonetheless, a key limitation lies in the underlying data. Both supervised and unsupervised methods rely on the discriminatory power of the chosen markers; when marker expression is noisy or overlapping, algorithms cannot fully resolve certain populations. This reflects constraints of experimental design rather than AutoFlow itself, underscoring the importance of thoughtful panel design in future applications.
In summary, the application’s integration of supervised and unsupervised gating methods brings a holistic approach to FC data analysis. It equips researchers with a versatile toolset to tackle various research questions, from precise quantification of known populations to the discovery of unexpected and novel cell subsets. This capability holds great promise for advancing the understanding of complex biological systems and it has the potential to drive transformation across research domains.
Supplementary Material
btag078_Supplementary_Data
References
- Adan A, Alizada G, Kiraz Y et al. Flow cytometry: basic principles and applications. Crit Rev Biotechnol 2017;37:163–76. doi:10.3109/07388551.2015.1128876. PMID:26767547.
- Aghaeepour N, Nikolic R, Hoos HH et al. Rapid cell population identification in flow cytometry data. Cytometry A 2011;79:6–13. doi:10.1002/cyto.a.21007. PMID:21182178. PMCID:PMC3137288.
- Blondel VD, Guillaume J-L, Lambiotte R et al. Fast unfolding of communities in large networks. J Stat Mech 2008;2008:P10008. doi:10.1088/1742-5468/2008/10/P10008.
- Bonilla DL, Reinin G, Chua E. Full spectrum flow cytometry as a powerful technology for cancer immunotherapy research. Front Mol Biosci 2020;7:612801. doi:10.3389/fmolb.2020.612801. PMID:33585561. PMCID:PMC7878389.
- Breiman L. Random forests. Mach Learn 2001;45:5–32. doi:10.1023/A:1010933404324.
- Chattopadhyay PK, Roederer M. Good cell, bad cell: flow cytometry reveals T-cell subsets important in HIV disease. Cytometry A 2010;77:614–22. doi:10.1002/cyto.a.20905. PMID:20583275. PMCID:PMC2907059.
- Chou DB, Frismantas V, Milton Y et al. On-chip recapitulation of clinical bone marrow toxicities and patient-specific pathophysiology. Nat Biomed Eng 2020;4:394–406. doi:10.1038/s41551-019-0495-z. PMID:31988457. PMCID:PMC7160021.
- Emmaneel A, Quintelier K, Sichien D et al. PeacoQC: peak-based selection of high quality cytometry data. Cytometry A 2022;101:325–38. doi:10.1002/cyto.a.24501. PMID:34549881. PMCID:PMC9293479.
