# TEAM: A Multiple Testing Algorithm on the Aggregation Tree for Flow Cytometry Analysis

**Authors:** John Pura, Xuechan Li, Cliburn Chan, Jichun Xie

arXiv: 1906.07757 · 2021-03-29

## TL;DR

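TEAM is a multiple testing algorithm that partitions flow cytometry data into small bins and tests them on an aggregation tree, from fine to coarse resolution. It pinpoints the regions where protein expression densities differ before and after a stimulus, and so the responsive immune cells, while controlling the false discovery rate (FDR).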

## Contribution

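This paper introduces TEAM (Testing on the Aggregation tree Method), an aggregation tree-based multiple testing method that localizes differential regions to the smallest possible bins and analyzes large flow cytometry data sets in much less time than competing methods.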

## Key findings

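- Identified monofunctional, bifunctional, and polyfunctional T cells responsive to CMV-pp65 antigen stimulation
- Ran in much less time than competing methods, which either did not finish in a reasonable time frame or gave less interpretable results
- Shown by theoretical justification and numerical simulation to be asymptotically valid, powerful, and robust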

## Abstract

In immunology studies, flow cytometry is a commonly used multivariate single-cell assay. One key goal in flow cytometry analysis is to pinpoint the immune cells responsive to certain stimuli. Statistically, this problem can be translated into comparing two protein expression probability density functions (PDFs) before and after the stimulus; the goal is to pinpoint the regions where these two PDFs differ. In this paper, we model this comparison as a multiple testing problem. First, we partition the sample space into small bins. In each bin, we form a hypothesis to test for the existence of differential PDFs. Second, we develop a novel multiple testing method, called TEAM (Testing on the Aggregation tree Method), to identify those bins that harbor differential PDFs while controlling the false discovery rate (FDR) under the desired level. TEAM embeds the testing procedure into an aggregation tree to test from fine- to coarse-resolution. The procedure achieves the statistical goal of pinpointing differential PDFs to the smallest possible regions. TEAM is computationally efficient, capable of analyzing large flow cytometry data sets in much less time than competing methods. We applied TEAM and competing methods to a flow cytometry data set to identify T cells responsive to cytomegalovirus (CMV)-pp65 antigen stimulation. TEAM successfully identified the monofunctional, bifunctional, and polyfunctional T cells, while the competing methods either did not finish in a reasonable time frame or provided less interpretable results. Numerical simulations and theoretical justifications demonstrate that TEAM has asymptotically valid, powerful, and robust performance. Overall, TEAM is a computationally efficient and statistically powerful algorithm that can yield meaningful biological insights in flow cytometry studies.
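The fine-to-coarse idea in the abstract can be illustrated with a small one-dimensional sketch. This is not the authors' implementation: the equal-size pooled bins, the one-sided binomial test, the per-layer Benjamini-Hochberg step (a stand-in for TEAM's own thresholds), and every function name below are assumptions made for illustration, and the sketch does not reproduce TEAM's FDR guarantee across layers.

```python
import math

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def benjamini_hochberg(pvals, alpha):
    """Indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])

def bin_counts(control, stimulated, n_bins):
    """Sort the pooled sample, cut it into equal-size bins, and return
    (stimulated count, bin size) for each bin. Leftover points are ignored."""
    pooled = sorted([(x, 0) for x in control] + [(x, 1) for x in stimulated])
    size = len(pooled) // n_bins
    return [(sum(label for _, label in pooled[b * size:(b + 1) * size]), size)
            for b in range(n_bins)]

def fine_to_coarse(counts, p0, alpha, n_layers):
    """Test the finest bins first; then merge neighbouring unrejected nodes
    pairwise and retest at a coarser resolution (hypothetical simplification)."""
    # Each node is (leaf bin indices, stimulated count, total count).
    nodes = [([i], k, n) for i, (k, n) in enumerate(counts)]
    rejected = set()
    for _ in range(n_layers):
        pvals = [binom_upper_tail(k, n, p0) for _, k, n in nodes]
        hits = set(benjamini_hochberg(pvals, alpha))
        for j in hits:
            rejected.update(nodes[j][0])
        survivors = [node for j, node in enumerate(nodes) if j not in hits]
        # Pair up survivors in order; an unpaired last node is dropped here.
        nodes = [(a[0] + b[0], a[1] + b[1], a[2] + b[2])
                 for a, b in zip(survivors[::2], survivors[1::2])]
        if not nodes:
            break
    return sorted(rejected)

# Synthetic data (integer "expression" values, made up for this example):
# both samples cover the range evenly, and the stimulated sample has
# 40 extra cells at the high end.
control = [100 * i for i in range(100)]
stimulated = [100 * i + 50 for i in range(100)] + [9010 + 25 * j for j in range(40)]

counts = bin_counts(control, stimulated, n_bins=8)
# Under the null, each bin holds stimulated cells at the overall pooled rate.
p0 = len(stimulated) / (len(control) + len(stimulated))
result = fine_to_coarse(counts, p0, alpha=0.05, n_layers=3)
print(result)
```

On this synthetic input the sketch flags only the last two bins, where the extra stimulated cells were placed. The coarser layers add nothing here because the signal is already strong at the finest resolution; merging bins matters when a weak difference is spread over several neighbouring bins.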

---
Source: https://tomesphere.com/paper/1906.07757