Grouping predictors via network-wide metrics
Brandon Woosuk Park, Anand N. Vidyashankar, and Tucker S. McElroy

TL;DR
This paper introduces a novel network-based methodology for grouping predictors in high-dimensional regression models, improving interpretability and predictive power while addressing computational and statistical challenges.
Contribution
It proposes a new supervised grouping algorithm using network-wide metrics, along with a model-assisted bootstrap for uncertainty assessment, with proven theoretical properties.
Findings
The method accurately identifies predictor groups in high-dimensional data.
The bootstrap procedure reduces computational complexity.
Numerical experiments demonstrate improved grouping and prediction performance.
Abstract
When multitudes of features can plausibly be associated with a response, both privacy considerations and model parsimony suggest grouping them to increase the predictive power of a regression model. Specifically, the identification of groups of predictors significantly associated with the response variable eases further downstream analysis and decision-making. This paper proposes a new data analysis methodology that utilizes the high-dimensional predictor space to construct an implicit network with weighted edges to identify significant associations between the response and the predictors. Using a population model for groups of predictors defined via network-wide metrics, a new supervised grouping algorithm is proposed to determine the correct group, with probability tending to one as the sample size diverges to infinity. For this reason, we establish several…
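To make the idea of a weighted predictor network concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes edge weights are absolute pairwise correlations, that groups are the connected components left after thresholding, and that a group's link to the response is scored by its mean absolute correlation with it. The function names (`correlation_network`, `connected_groups`, `group_response_association`) and the threshold of 0.5 are hypothetical choices made only for this example; the paper's network-wide metrics and supervised grouping rule are not specified in the abstract.

```python
import numpy as np


def correlation_network(X, threshold=0.5):
    """Weighted adjacency matrix: |corr| between predictors (assumed weighting).

    Edges with weight below `threshold` are removed (set to zero).
    """
    W = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(W, 0.0)
    W[W < threshold] = 0.0
    return W


def connected_groups(W):
    """Label the connected components of the thresholded network (depth-first search)."""
    p = W.shape[0]
    labels = -np.ones(p, dtype=int)
    current = 0
    for start in range(p):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(W[node] > 0):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels


def group_response_association(X, y, labels):
    """Mean |corr(x_j, y)| within each group: a crude supervised score (assumption)."""
    scores = {}
    for g in np.unique(labels):
        cols = np.flatnonzero(labels == g)
        r = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in cols]
        scores[int(g)] = float(np.mean(r))
    return scores


# Synthetic data: two blocks of three predictors, only the first drives y.
rng = np.random.default_rng(0)
n = 500
z1, z2 = rng.normal(size=(2, n))
X = np.column_stack(
    [z1 + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [z2 + 0.3 * rng.normal(size=n) for _ in range(3)]
)
y = z1 + 0.5 * rng.normal(size=n)

W = correlation_network(X, threshold=0.5)
labels = connected_groups(W)
scores = group_response_association(X, y, labels)
```

On this synthetic data the two latent blocks should come out as separate groups, with the first block scoring much higher against the response. A thresholded-correlation graph is only one of many ways to build such a network and was chosen here for brevity.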
Taxonomy
Topics: Complex Network Analysis Techniques
