A systematic evaluation of methods for cell phenotype classification using single-cell RNA sequencing data
Xiaowen Cao, Li Xing, Elham Majd, Hua He, Junhua Gu, Xuekui Zhang

TL;DR
This paper systematically evaluates 13 supervised machine learning algorithms for cell phenotype classification using single-cell RNA sequencing data, highlighting their performance across different dataset sizes and providing practical recommendations.
Contribution
It offers a comprehensive benchmark of supervised algorithms for cell phenotype classification in scRNA-seq data, including performance metrics and gene selection evaluation.
Findings
ElasticNet with interactions excels in small and medium datasets.
Naive Bayes performs well in medium datasets.
XGBoost is best suited for large datasets.
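The comparison across dataset sizes summarized above can be illustrated with a small sketch. It is not the paper's pipeline: it assumes toy Gaussian "expression" data, a hand-written Gaussian naive Bayes classifier (one of the method families named in the findings), a single 70/30 holdout split, and hypothetical dataset sizes of 50, 500, and 5000 cells. The function names (`simulate_cells`, `fit_gaussian_nb`, `holdout_accuracy`) are illustrative only.

```python
import math
import random


def simulate_cells(n_cells, n_genes=20, n_informative=5, seed=0):
    """Toy stand-in for scRNA-seq data: two phenotypes, Gaussian 'expression'.

    Assumption: real counts are sparse and non-Gaussian; this is only a sketch.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_cells):
        label = rng.randint(0, 1)
        # Only the first n_informative genes shift with the phenotype.
        row = [
            rng.gauss(1.0 if (label == 1 and g < n_informative) else 0.0, 1.0)
            for g in range(n_genes)
        ]
        X.append(row)
        y.append(label)
    return X, y


def fit_gaussian_nb(X, y):
    """Estimate a log prior and per-gene mean/variance for each class."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps variances strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[c] = (math.log(n / len(y)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for one cell."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def holdout_accuracy(n_cells, seed=0):
    """Train on the first 70% of simulated cells, score on the remaining 30%."""
    X, y = simulate_cells(n_cells, seed=seed)
    split = int(0.7 * n_cells)
    model = fit_gaussian_nb(X[:split], y[:split])
    hits = sum(predict(model, x) == label for x, label in zip(X[split:], y[split:]))
    return hits / (n_cells - split)


if __name__ == "__main__":
    # Hypothetical "small", "medium", and "large" sizes, not the paper's.
    for n_cells in (50, 500, 5000):
        print(n_cells, round(holdout_accuracy(n_cells), 3))
```

A fuller benchmark would repeat this loop for each candidate classifier and use cross-validation rather than a single split, so that rankings at each dataset size are less sensitive to one random partition.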
Abstract
Background: Single-cell RNA sequencing (scRNA-seq) yields valuable insights about gene expression and gives critical information about complex tissue cellular composition. In the analysis of single-cell RNA sequencing, the annotations of cell subtypes are often done manually, which is time-consuming and irreproducible. Garnett is a cell-type annotation software based on the elastic net method. Besides cell-type annotation, supervised machine learning methods can also be applied to predict other cell phenotypes from genomic data. Despite the popularity of such applications, there is no existing study to systematically investigate the performance of those supervised algorithms in various sizes of scRNA-seq data sets. Methods and Results: This study evaluates 13 popular supervised machine learning algorithms to classify cell phenotypes, using published real and simulated data sets with…
Taxonomy
Topics: Single-cell and spatial transcriptomics
