An Aggregation Method for Sparse Logistic Regression
Zhe Liu

TL;DR
This paper introduces an aggregation method for sparse logistic regression that combines multiple models to improve feature selection and predictive accuracy in high-dimensional data, demonstrated through simulations and real genome data.
Contribution
It proposes a novel aggregation approach for sparse logistic regression that balances prediction and interpretability, addressing false positives in feature selection.
Findings
Improved feature selection accuracy in high-dimensional settings
Enhanced predictive performance over traditional L1 regularization
Effective application to genome-wide association data
Abstract
L1-regularized logistic regression has become a workhorse of data mining and bioinformatics: it is widely used for classification problems, particularly those with many features. However, L1 regularization typically selects too many features, so false positives are unavoidable. In this paper, we demonstrate and analyze an aggregation method for sparse logistic regression in high dimensions. This approach linearly combines the estimators from a suitable set of logistic models with different underlying sparsity patterns and can balance predictive ability and model interpretability. The numerical performance of the proposed aggregation method is then investigated using simulation studies. We also analyze a published genome-wide case-control dataset to further evaluate the usefulness of the aggregation method in multilocus association mapping.
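The idea of linearly combining estimators fitted under different sparsity patterns can be illustrated with a small sketch. The abstract does not say how the candidate models or the combination weights are chosen, so everything specific here is an assumption made for illustration: the candidate set (all supports of size at most 2), the exponential weights based on a sparsity-penalized log-likelihood, the `penalty` value, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random
from itertools import combinations

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, support, steps=200, lr=1.0):
    """Unpenalized logistic fit by gradient ascent, restricted to `support`.
    Coefficients outside the support stay exactly zero."""
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = {j: 0.0 for j in support}
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(sum(beta[j] * xi[j] for j in support))
            for j in support:
                grad[j] += resid * xi[j]
        for j in support:
            beta[j] += lr * grad[j] / len(X)
    return beta

def log_likelihood(X, y, beta):
    # Bernoulli log-likelihood: sum of y*z - log(1 + exp(z)).
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, xi))
        total += yi * z - softplus(z)
    return total

def aggregate(X, y, supports, penalty=1.0):
    """Linearly combine the per-support estimators.
    ASSUMPTION: weights are proportional to
    exp(log-likelihood - penalty * support size); the paper may differ."""
    fits = [fit_logistic(X, y, s) for s in supports]
    scores = [log_likelihood(X, y, b) - penalty * len(s)
              for b, s in zip(fits, supports)]
    top = max(scores)  # subtract the max before exponentiating for stability
    raw = [math.exp(s - top) for s in scores]
    weights = [r / sum(raw) for r in raw]
    p = len(X[0])
    beta = [sum(w * b[j] for w, b in zip(weights, fits)) for j in range(p)]
    return beta, weights

# Simulated data: only the first two of five features carry signal.
random.seed(0)
n, p = 150, 5
true_beta = [2.0, -2.0, 0.0, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if random.random() < sigmoid(sum(b * x for b, x in zip(true_beta, xi)))
     else 0 for xi in X]

# Candidate sparsity patterns: every subset of at most two features.
supports = [s for k in range(3) for s in combinations(range(p), k)]
beta_agg, weights = aggregate(X, y, supports)
print("aggregated coefficients:", [round(b, 3) for b in beta_agg])
```

Because each candidate model is exactly zero outside its own support, the aggregated coefficient of a feature is large only when well-supported models include it. In a genuinely high-dimensional problem, enumerating subsets is infeasible, and the candidate set would have to come from some other source, such as supports visited along a regularization path.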
Taxonomy
Topics: Statistical Methods and Inference · Gene expression and cancer classification · Liver Disease Diagnosis and Treatment
Methods: Logistic Regression
