Multiresolution categorical regression for interpretable cell type annotation
Aaron J. Molstad, Keshav Motwani

TL;DR
This paper proposes a data-driven method, with a scalable fitting algorithm, for multinomial logistic regression in high dimensions when the response categories have a multiresolution structure. The motivating application is interpretable cell type annotation from gene expression profiles.
Contribution
The authors develop a unified, data-driven method and a scalable fitting algorithm for high-dimensional multiresolution categorical regression. The method estimates the resolution at which each predictor affects the response category probabilities, which makes the fitted model easier to interpret.
Findings
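The method models cell type probabilities as a function of gene expression profiles.
It identifies which predictors distinguish between coarse categories but not fine categories, which distinguish between fine categories, and which are irrelevant.
Grouping predictors by resolution in this way supports interpretable cell type annotation.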
Abstract
In many categorical response regression applications, the response categories admit a multiresolution structure. That is, subsets of the response categories may naturally be combined into coarser response categories. In such applications, practitioners are often interested in estimating the resolution at which a predictor affects the response category probabilities. In this article, we propose a method for fitting the multinomial logistic regression model in high dimensions that addresses this problem in a unified and data-driven way. In particular, our method allows practitioners to identify which predictors distinguish between coarse categories but not fine categories, which predictors distinguish between fine categories, and which predictors are irrelevant. For model fitting, we propose a scalable algorithm that can be applied when the coarse categories are defined by either…
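The three kinds of predictor described in the abstract can be illustrated with a small numerical sketch. This is not the authors' estimator or fitting algorithm; it only shows, for a multinomial logistic model with a coefficient matrix that is already given, how equal coefficients within a coarse category make a predictor act at the coarse level only. The hierarchy `GROUPS`, the matrix `B`, and all function names are hypothetical and chosen for illustration.

```python
import numpy as np

# Hypothetical hierarchy: four fine cell types nested in two coarse types.
GROUPS = {"T cell": [0, 1], "B cell": [2, 3]}

# Hypothetical coefficients: rows are predictors (genes), columns are fine categories.
B = np.array([
    [1.0, 1.0, -1.0, -1.0],  # equal within each coarse group: coarse-level effect only
    [0.5, -0.5, 0.0, 0.0],   # differs within the first group: fine-level effect
    [0.0, 0.0, 0.0, 0.0],    # equal across all categories: irrelevant
])

def fine_probs(x, coef):
    """Multinomial logistic (softmax) probabilities over the fine categories."""
    logits = x @ coef
    logits = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def coarse_probs(p, groups):
    """Coarse category probabilities: sum the fine probabilities in each group."""
    return np.array([p[idx].sum() for idx in groups.values()])

def within_group_probs(p, idx):
    """Fine category probabilities conditional on the coarse group idx."""
    return p[idx] / p[idx].sum()

def predictor_resolution(coef, groups, tol=1e-8):
    """Label each predictor by the resolution at which its coefficients vary."""
    labels = []
    for row in coef:
        if row.max() - row.min() <= tol:
            # A row that is constant across all categories cancels in the softmax.
            labels.append("irrelevant")
        elif all(row[idx].max() - row[idx].min() <= tol for idx in groups.values()):
            labels.append("coarse")
        else:
            labels.append("fine")
    return labels

labels = predictor_resolution(B, GROUPS)

# Change only the coarse-level predictor (index 0) and compare probabilities.
x_low = np.array([0.0, 1.0, 0.0])
x_high = np.array([2.0, 1.0, 0.0])
p_low, p_high = fine_probs(x_low, B), fine_probs(x_high, B)

print(labels)
print(coarse_probs(p_low, GROUPS), coarse_probs(p_high, GROUPS))
print(within_group_probs(p_low, [0, 1]), within_group_probs(p_high, [0, 1]))
```

In this sketch, changing the coarse-level predictor shifts probability between the two coarse types while leaving the conditional probabilities within each coarse type unchanged, which is the sense in which such a predictor distinguishes coarse categories but not fine ones.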
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning and Data Classification · Optimal Experimental Design Methods
