Benchmarking Preprocessing and Integration Methods in Single-Cell Genomics
Ali Anaissi, Seid Miad Zandavi, Weidong Huang, Junaid Akram, Basem Suleiman, Ali Braytee, Jie Hua

TL;DR
This paper systematically evaluates various preprocessing, normalization, and integration methods for single-cell multimodal data, highlighting optimal combinations and their performance across diverse datasets.
Contribution
It provides a comprehensive benchmarking of preprocessing and integration techniques in single-cell genomics, guiding method selection based on dataset characteristics.
Findings
Harmony and Seurat are top data integration methods.
Harmony is more time-efficient for large datasets.
UMAP pairs well with various integration techniques.
Abstract
Single-cell data analysis has the potential to revolutionize personalized medicine by characterizing disease-associated molecular changes at the single-cell level. Advanced single-cell multimodal assays can now simultaneously measure various molecules (e.g., DNA, RNA, Protein) across hundreds of thousands of individual cells, providing a comprehensive molecular readout. A significant analytical challenge is integrating single-cell measurements across different modalities. Various methods have been developed to address this challenge, but there has been no systematic evaluation of these techniques with different preprocessing strategies. This study examines a general pipeline for single-cell data analysis, which includes normalization, data integration, and dimensionality reduction. The performance of different algorithm combinations often depends on the dataset sizes and…
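The three-stage pipeline named in the abstract (normalization, data integration, dimensionality reduction) can be sketched roughly as follows. This is a minimal illustration on synthetic counts using only NumPy, not the paper's benchmark code: the function names `log_normalize`, `center_per_batch`, and `pca` are hypothetical, per-batch mean centering is a deliberately simple stand-in for methods such as Harmony or Seurat, and plain PCA stands in for UMAP.

```python
import numpy as np


def log_normalize(counts, scale=1e4):
    """Library-size normalization followed by log1p (a common default)."""
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / totals * scale)


def center_per_batch(x, batches):
    """Stand-in for integration: subtract each batch's mean profile.

    Assumption: this is NOT Harmony or Seurat, only a placeholder that
    removes a per-batch additive shift.
    """
    out = x.copy()
    for b in np.unique(batches):
        mask = batches == b
        out[mask] -= out[mask].mean(axis=0)
    return out


def pca(x, n_components=2):
    """Linear dimensionality reduction via SVD (stand-in for UMAP)."""
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Synthetic data: 200 cells x 50 genes, two batches of 100 cells each.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(200, 50)).astype(float)
batches = np.repeat(np.array([0, 1]), 100)
counts[batches == 1, :10] *= 3.0  # crude gene-specific batch effect

normalized = log_normalize(counts)
integrated = center_per_batch(normalized, batches)
embedding = pca(integrated, n_components=2)
print(embedding.shape)  # (200, 2)
```

Swapping any one stage for a different method while holding the others fixed is the kind of combination the benchmark compares; the stand-ins here only show how the stages chain together.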
Taxonomy
Topics: Single-cell and spatial transcriptomics · Cell Image Analysis Techniques · Gene expression and cancer classification
