Data-Driven Random Projection and Screening for High-Dimensional Generalized Linear Models
Roman Parzer, Peter Filzmoser, Laura Vana-Gür

TL;DR
This paper introduces a data-driven random projection and screening method for high-dimensional generalized linear models, effectively handling correlated predictors and varying sparsity levels, with demonstrated improvements in prediction and interpretability.
Contribution
It proposes a novel, data-driven approach combining ridge-based screening and random projection, outperforming traditional methods like elastic net in high-dimensional GLMs.
Findings
Improved prediction accuracy over conventional methods.
Effective variable screening with low computational cost.
Enhanced interpretability in applications with count and binary responses.
Abstract
We address the challenge of correlated predictors in high-dimensional GLMs, where regression coefficients range from sparse to dense, by proposing a data-driven random projection method. This is particularly relevant for applications where the number of predictors is (much) larger than the number of observations and the underlying structure -- whether sparse or dense -- is unknown. We achieve this by using ridge-type estimates for variable screening and random projection to incorporate information about the response-predictor relationship when performing dimensionality reduction. We demonstrate that a ridge estimator with a small penalty is effective for random projection and screening, but the penalty value must be carefully selected. Unlike in linear regression, where penalties approaching zero work well, this approach leads to overfitting in non-Gaussian families. Instead, we…
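The screening and projection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It covers only the Gaussian (linear regression) case, where the abstract says small penalties work well. The helper names (`ridge_coefficients`, `screen`, `data_driven_projection`), the penalty value, the number of retained variables, the projection dimension, and the simulated data are all assumptions made for illustration. In particular, the projection used here, which places each retained variable in one random row with its ridge coefficient as the entry, is one plausible way to let the response inform the projection.

```python
import numpy as np

def ridge_coefficients(X, y, penalty):
    """Ridge estimate via the n x n dual form, which is cheap when p >> n."""
    n = X.shape[0]
    # beta = X^T (X X^T + penalty * I)^{-1} y
    return X.T @ np.linalg.solve(X @ X.T + penalty * np.eye(n), y)

def screen(beta, n_keep):
    """Sorted indices of the n_keep coefficients largest in absolute value."""
    return np.sort(np.argsort(-np.abs(beta))[:n_keep])

def data_driven_projection(beta_kept, n_components, rng):
    """Sparse projection matrix (assumed form): each retained variable is
    assigned to one random row, and its ridge coefficient is the entry."""
    n_kept = beta_kept.shape[0]
    rows = rng.integers(0, n_components, size=n_kept)
    Phi = np.zeros((n_components, n_kept))
    Phi[rows, np.arange(n_kept)] = beta_kept
    return Phi

# Simulated data (hypothetical): n = 50 observations, p = 500 predictors.
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:10] = 2.0
y = X @ beta_true + rng.standard_normal(n)

# 1. Ridge estimate with a small penalty (illustrative value).
beta_ridge = ridge_coefficients(X, y, penalty=1e-2)

# 2. Screening: keep the 100 variables with the largest |coefficient|.
kept = screen(beta_ridge, n_keep=100)

# 3. Project the retained predictors down to 10 dimensions.
Phi = data_driven_projection(beta_ridge[kept], n_components=10, rng=rng)
Z = X[:, kept] @ Phi.T

# 4. Least squares on the reduced predictors, mapped back to p coefficients.
gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat = np.zeros(p)
beta_hat[kept] = Phi.T @ gamma
```

For a non-Gaussian family, the least-squares fit in step 4 would be replaced by a GLM fit on `Z`, and, as the abstract notes, the ridge penalty could no longer be taken close to zero.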
Taxonomy
Topics: Image Processing and 3D Reconstruction · Soil Geostatistics and Mapping
