A powerful penalized multinomial logistic regression approach
Cornelia Fuetterer, Malte Nalenz, Thomas Augustin, Ruth M. Pfeiffer

TL;DR
This paper introduces a new penalized regression method called DP-lasso for categorical outcomes, which improves variable selection in high-dimensional data.
Contribution
The novel DP-lasso method uses adaptive L1-type penalties based on predictor distances across outcome categories.
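As a rough orientation, an adaptive L1-type penalized multinomial logistic objective can be written in the generic form below. This is a sketch under assumptions, not necessarily the exact formulation used in the paper: the symbols n (samples), p (predictors), K (outcome categories), the coefficients β_jk, the tuning parameter λ, and the weights w_j are notation introduced here for illustration.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n} \log P\!\left(y_i \mid x_i;\, \beta\right)
  \;+\; \lambda \sum_{j=1}^{p} w_j \sum_{k=1}^{K} \left|\beta_{jk}\right|,
\qquad
P\!\left(y_i = k \mid x_i;\, \beta\right)
  = \frac{\exp\!\left(x_i^{\top}\beta_{\cdot k}\right)}
         {\sum_{l=1}^{K} \exp\!\left(x_i^{\top}\beta_{\cdot l}\right)}
```

In this form, a predictor whose values separate the outcome categories well would receive a small weight w_j and therefore be shrunk less.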
Findings
DP-lasso with ANOVA-based weights (DPan) produced sparser models with high true positive rates in high-dimensional settings.
DPan outperformed other methods in terms of false positive rates across various simulation scenarios.
The method was successfully applied to ultra high-dimensional single-cell RNA-sequencing datasets.
Abstract
Penalized regression methods that shrink model coefficients are popular for improving prediction and for variable selection in high-dimensional settings. We present a penalized (or regularized) regression approach for multinomial logistic models for categorical outcomes with a novel adaptive L1-type penalty term that incorporates weights based on intra- and inter-outcome category distances of each predictor. A predictor that has large between- and small within-outcome category distances is penalized less and is more likely to be selected for the final model. We propose and study three measures for weight calculation: an analysis of variance (ANOVA)-based measure and two indices used in clustering approaches. Our novel approach, which we term the discriminative power lasso (DP-lasso), thus combines elements of marginal screening with regularized regression methods. We…
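To make the ANOVA-based weighting idea concrete, the following minimal sketch computes a one-way ANOVA F statistic for each predictor across outcome categories and turns it into a penalty weight. The function names `anova_f` and `penalty_weights`, the inverse mapping `1 / (F + eps)`, and the toy data are all assumptions made for illustration; the abstract does not state the exact weight formula, so this should not be read as the authors' implementation.

```python
from collections import defaultdict


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single predictor across outcome categories."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    # Between-category and within-category sums of squares.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    if ss_within == 0:
        # Perfect separation gives an unbounded F; a constant predictor gives 0.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def penalty_weights(columns, labels, eps=1e-8):
    """Hypothetical mapping (an assumption): larger F gives a smaller penalty weight."""
    return [1.0 / (anova_f(col, labels) + eps) for col in columns]


# Toy data: one predictor that separates the categories, one that does not.
labels = ["a", "a", "b", "b", "c", "c"]
informative = [0.0, 0.2, 5.0, 5.2, 10.0, 10.2]
noise = [1.0, 3.0, 1.2, 2.8, 0.9, 3.1]

weights = penalty_weights([informative, noise], labels)
```

Weights of this kind could then be passed as per-predictor penalty factors to a lasso solver that supports them, so that the informative predictor is shrunk less than the noisy one.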
Taxonomy
Topics: Statistical Methods and Inference · Fuzzy Systems and Optimization · Optimal Experimental Design Methods
