TL;DR
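This paper introduces a penalized likelihood approach for high-dimensional bivariate categorical response regression, with proposed generalizations to the multivariate case. The fitted model indicates which predictors are irrelevant, which affect only the marginal distributions of the response, and which also affect the log odds ratios. The method is evaluated in simulation studies and a pan-cancer risk prediction application.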
Contribution
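It presents a penalized likelihood method for bivariate categorical response regression, an efficient first order algorithm that extends to the semi-supervised setting (where some subjects have only one response variable measured), an asymptotic error bound for high-dimensional settings, and proposed generalizations to the multivariate case.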
Findings
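Variable selection in high-dimensional settings that separates irrelevant predictors, predictors affecting only the marginal distributions, and predictors also affecting the log odds ratios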
Improved interpretability and prediction accuracy
Evaluated in simulation studies and a pan-cancer risk prediction application
Abstract
We propose a penalized likelihood method to fit the bivariate categorical response regression model. Our method allows practitioners to estimate which predictors are irrelevant, which predictors only affect the marginal distributions of the bivariate response, and which predictors affect both the marginal distributions and log odds ratios. To compute our estimator, we propose an efficient first order algorithm which we extend to settings where some subjects have only one response variable measured, i.e., the semi-supervised setting. We derive an asymptotic error bound which illustrates the performance of our estimator in high-dimensional settings. Generalizations to the multivariate categorical response regression model are proposed. Finally, simulation studies and an application in pan-cancer risk prediction demonstrate the usefulness of our method in terms of interpretability and…
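The three predictor types named in the abstract can be illustrated with a small numerical sketch. It assumes a standard multinomial-logit parameterization of the joint distribution, which the abstract does not spell out, and every name and coefficient value below is hypothetical; it is not the authors' estimator or algorithm. Under that assumption, a predictor whose coefficients are all zero is irrelevant, a predictor whose coefficients are additive across the two responses changes the marginals but leaves the local log odds ratios untouched, and a predictor with a non-additive coefficient pattern changes the log odds ratios as well.

```python
import numpy as np

def joint_probs(x, B):
    """Joint probabilities of a J x K bivariate categorical response.

    Assumed model (illustrative): P(Y1=j, Y2=k | x) is proportional to
    exp(sum_m x[m] * B[m, j, k]), with B of shape (p, J, K).
    """
    eta = np.tensordot(x, B, axes=(0, 0))  # J x K linear predictors
    w = np.exp(eta - eta.max())            # shift for numerical stability
    return w / w.sum()

def local_log_odds_ratios(P):
    """(J-1) x (K-1) local log odds ratios of a joint probability table."""
    L = np.log(P)
    return L[:-1, :-1] + L[1:, 1:] - L[:-1, 1:] - L[1:, :-1]

J, K = 3, 2
B = np.zeros((4, J, K))
B[0] = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]  # intercept (hypothetical values)
# Predictor 1: all-zero coefficients, so it is irrelevant.
# Predictor 2: additive pattern a_j + b_k, so it affects only the marginals.
a = np.array([[0.5], [-0.3], [0.0]])
b = np.array([[0.0, 0.8]])
B[2] = a + b
# Predictor 3: non-additive pattern, so it also affects the log odds ratios.
B[3] = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

def table_at(x1, x2, x3):
    """Joint table at the given predictor values (leading 1 is the intercept)."""
    return joint_probs(np.array([1.0, x1, x2, x3]), B)

P_base = table_at(0.0, 0.0, 0.0)
P_irrelevant = table_at(2.0, 0.0, 0.0)
P_marginal_only = table_at(0.0, 2.0, 0.0)
P_odds_ratio = table_at(0.0, 0.0, 2.0)

print("base log odds ratios:\n", local_log_odds_ratios(P_base))
print("after changing predictor 2:\n", local_log_odds_ratios(P_marginal_only))
print("after changing predictor 3:\n", local_log_odds_ratios(P_odds_ratio))
```

In this parameterization, a penalty that shrinks each predictor's coefficient matrix toward zero, or toward an additive pattern, would yield the three-way classification described above; how the paper's penalty actually achieves this is not stated in the abstract.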
