High-dimensional regression over disease subgroups
Frank Dondelinger, Sach Mukherjee, The Alzheimer's Disease Neuroimaging Initiative

TL;DR
This paper introduces a penalized high-dimensional regression method for disease subgroups, enabling joint estimation that leverages subgroup similarities to improve prediction and interpretability in biomedical data.
Contribution
It proposes a novel joint estimation framework combining sparsity and subgroup similarity penalties, tailored for limited sample sizes in biomedical subgroup analysis.
Findings
Improved prediction accuracy on simulated and real biomedical datasets.
Effective estimation of subgroup-specific sparsity patterns.
Demonstrated benefits in diseases like Alzheimer's, ALS, and cancer.
Abstract
We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where disease subtypes, for example, may differ with respect to underlying regression models, but sample sizes at the subgroup-level may be limited. We focus on the case in which subgroup-specific models may be expected to be similar but not necessarily identical. Our approach is to treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an ℓ1 term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer's disease, amyotrophic lateral…
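The penalized framework described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a squared ℓ2 fusion penalty over all pairs of subgroups and a plain proximal-gradient (ISTA) solver, chosen only because the fusion term is then smooth and the ℓ1 term reduces to soft-thresholding. The function names `joint_lasso` and `soft_threshold` and the parameters `lam`, `gamma`, and `n_iter` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def joint_lasso(Xs, ys, lam, gamma, n_iter=500):
    """Sketch of joint sparse regression over K subgroups.

    Minimizes (assumed objective, for illustration only):
        sum_k 1/(2 n_k) ||y_k - X_k b_k||^2
        + lam * sum_k ||b_k||_1
        + gamma/2 * sum_{k < k'} ||b_k - b_k'||^2

    Xs, ys: lists of per-subgroup design matrices (n_k x p) and responses.
    Returns a (K, p) array with one coefficient vector per subgroup.
    """
    K = len(Xs)
    p = Xs[0].shape[1]
    B = np.zeros((K, p))

    # Upper bound on the Lipschitz constant of the smooth part:
    # largest data-term curvature plus gamma * K from the pairwise fusion term.
    lip = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs) + gamma * K
    step = 1.0 / lip

    for _ in range(n_iter):
        total = B.sum(axis=0)
        grad = np.empty_like(B)
        for k, (X, y) in enumerate(zip(Xs, ys)):
            resid = X @ B[k] - y
            # Gradient of the squared-error term plus the fusion term,
            # where sum_{k' != k} (b_k - b_k') = K * b_k - sum of all b.
            grad[k] = X.T @ resid / X.shape[0] + gamma * (K * B[k] - total)
        # Gradient step on the smooth part, then the l1 proximal step.
        B = soft_threshold(B - step * grad, step * lam)
    return B
```

In this sketch, setting `gamma=0` decouples the subgroups into separate lasso fits, while increasing `gamma` pulls the subgroup-specific coefficient vectors toward one another, which is the information-sharing behaviour the abstract describes.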
Taxonomy
Topics: Statistical Methods and Inference · Sparse and Compressive Sensing Techniques · Bayesian Methods and Mixture Models
