Low dimensional representation of multi-patient flow cytometry datasets using optimal transport for minimal residual disease detection in leukemia
Erell Gachon, Jérémie Bigot, Elsa Cazelles, Audrey Bidet, Jean-Philippe Vial, Pierre-Yves Dumas, Aguirre Mimoun

TL;DR
This paper introduces an optimal transport-based method for low-dimensional visualization and clustering of multi-patient flow cytometry data to improve minimal residual disease detection in leukemia.
Contribution
It proposes an optimal transport (OT) based framework for representing high-dimensional flow cytometry measurement (FCM) datasets in low dimensions, to support visualization and analysis of minimal residual disease (MRD) in acute myeloid leukemia (AML).
Findings
OT-based approach outperforms kernel mean embedding techniques.
Enables effective 2D visualization of patient data.
Improves clustering of MRD levels in AML.
Abstract
Representing and quantifying Minimal Residual Disease (MRD) in Acute Myeloid Leukemia (AML), a type of cancer that affects the blood and bone marrow, is essential in the prognosis and follow-up of AML patients. As traditional cytological analysis cannot detect leukemia cells below 5%, the analysis of flow cytometry datasets is expected to provide more reliable results. In this paper, we explore statistical learning methods based on optimal transport (OT) to achieve a relevant low-dimensional representation of multi-patient flow cytometry measurements (FCM) datasets considered as high-dimensional probability distributions. Using the framework of OT, we justify the use of the K-means algorithm for dimensionality reduction of multiple large-scale point clouds through mean measure quantization by merging all the data into a single point cloud. After this quantization step, the visualization…
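The quantization step described in the abstract (merge every patient's point cloud, run K-means once on the pooled data) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`kmeans`, `quantize_patients`), the plain Lloyd iterations, the toy data, and the choice to summarize each patient by the fraction of cells assigned to each shared centroid are all assumptions made for illustration. Pooling the raw points matches the mean measure only when every patient contributes the same number of cells, which the toy data below assumes.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd iterations (illustrative, not optimized for large data)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group; keep empty ones as-is.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def quantize_patients(patients, k, seed=0):
    """Pool all clouds, fit shared centers, return per-patient weight vectors."""
    pooled = [p for cloud in patients for p in cloud]
    centers = kmeans(pooled, k, seed=seed)
    weights = []
    for cloud in patients:
        counts = [0] * k
        for p in cloud:
            counts[nearest(p, centers)] += 1
        weights.append([c / len(cloud) for c in counts])
    return centers, weights

# Toy data (hypothetical): two "patients", 200 cells each, 2 markers per cell.
# The second patient has a small shifted sub-population.
rng = random.Random(1)

def toy_cloud(n, frac_shifted):
    cloud = []
    for i in range(n):
        shift = 5.0 if i < n * frac_shifted else 0.0
        cloud.append((rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)))
    return cloud

patients = [toy_cloud(200, 0.0), toy_cloud(200, 0.2)]
centers, weights = quantize_patients(patients, k=2)
for i, w in enumerate(weights):
    print(f"patient {i}: weights on shared centroids = {w}")
```

After this step every patient is a discrete probability measure supported on the same K centroids, so patients can be compared through their weight vectors. In practice a library K-means (for example scikit-learn's) would replace the hand-written loop for large-scale data.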
Taxonomy
Methods: Principal Components Analysis
