Binned multinomial logistic regression for integrative cell type annotation
Keshav Motwani, Rhonda Bacher, and Aaron J. Molstad

TL;DR
This paper introduces a novel binned multinomial logistic regression method for integrating multiple single-cell datasets with inconsistent labels, improving cell type annotation accuracy and resolution.
Contribution
It proposes a new estimator and optimization algorithm for unified cell type probability modeling across diverse datasets, enhancing annotation consistency.
Findings
Achieves higher accuracy than existing methods in simulation studies.
Effectively predicts fine-resolution cell types.
Refines coarse annotations in real datasets.
Abstract
Categorizing individual cells into one of many known cell type categories, also known as cell type annotation, is a critical step in the analysis of single-cell genomics data. The current process of annotation is time-intensive and subjective, which has led to different studies describing cell types with labels of varying degrees of resolution. While supervised learning approaches have provided automated solutions to annotation, there remains a significant challenge in fitting a unified model for multiple datasets with inconsistent labels. In this article, we propose a new multinomial logistic regression estimator which can be used to model cell type probabilities by integrating multiple datasets with labels of varying resolution. To compute our estimator, we solve a nonconvex optimization problem using a blockwise proximal gradient descent algorithm. We show through simulation studies…
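To make the idea of labels of varying resolution concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes that a coarse label corresponds to a "bin" of fine-resolution cell types, and that the probability of a coarse label is the sum of the multinomial logistic probabilities of the fine types in that bin. The function names (`binned_neg_log_lik`, `group_soft_threshold`), the 0/1 `bins` mask, and the row-wise group-lasso-type penalty behind the proximal step are all illustrative assumptions; the abstract does not specify the penalty or the block structure.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def binned_neg_log_lik(X, bins, B):
    """Average negative log-likelihood under the assumed binned model.

    X    : (n, p) feature matrix, e.g. gene expression per cell
    bins : (n, K) 0/1 mask; bins[i, k] = 1 if fine type k is compatible
           with the (possibly coarse) label observed for cell i
    B    : (p, K) coefficient matrix over the K fine-resolution types
    """
    probs = softmax(X @ B)                 # fine-type probabilities per cell
    bin_prob = (probs * bins).sum(axis=1)  # probability of the observed bin
    return -np.mean(np.log(bin_prob))

def group_soft_threshold(B, thresh):
    """Proximal operator of a row-wise group-lasso penalty (an assumption).

    Each row (one feature across all types) is shrunk toward zero and set
    exactly to zero when its Euclidean norm is at most `thresh`.
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - thresh / np.maximum(norms, 1e-12))
    return B * scale

# Hypothetical example: three fine types (CD4 T, CD8 T, B cell).
# Cell 0 is labelled coarsely as "T cell" = {CD4 T, CD8 T}; cell 1 as "B cell".
X = np.ones((2, 2))
bins = np.array([[1, 1, 0],
                 [0, 0, 1]])
B = np.zeros((2, 3))                       # all-zero coefficients: uniform probabilities
nll = binned_neg_log_lik(X, bins, B)       # -(log(2/3) + log(1/3)) / 2
shrunk = group_soft_threshold(np.array([[3.0, 4.0], [0.1, 0.0]]), 1.0)
```

Under this sketch, a finely labelled cell has a single 1 in its row of `bins`, which recovers the ordinary multinomial logistic likelihood. With a coarse label, the log of a sum of softmax terms is a difference of two convex functions of `B` and so is not convex in general, which is consistent with the nonconvex problem the abstract describes. A proximal gradient iteration would alternate a gradient step on the likelihood with a shrinkage step such as `group_soft_threshold`.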
Taxonomy
Topics: Single-cell and spatial transcriptomics · Gene expression and cancer classification · Machine Learning and Algorithms
