The Price of Competition: Effect Size Heterogeneity Matters in High Dimensions
Hua Wang, Yachong Yang, Weijie J. Su

TL;DR
This paper shows that effect size heterogeneity, meaning how much the nonzero regression coefficients differ in magnitude, shapes the trade-off between false and true positives along the Lasso path in high-dimensional sparse regression under linear sparsity.
Contribution
It introduces the concept of effect size heterogeneity and demonstrates its critical role in the performance of Lasso in high-dimensional settings, providing new theoretical insights.
Findings
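Maximal effect size heterogeneity leads to the optimal trade-off between false and true positive rates, uniformly along the Lasso path.
Minimal heterogeneity, where all nonzero effect sizes are roughly equal, gives the worst trade-off and causes earlier false variable selection.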
Effect size heterogeneity complements sparsity in analyzing high-dimensional regression.
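To make the false/true positive trade-off in the findings above concrete, here is a minimal Python sketch of how the true positive proportion (TPP) and false discovery proportion (FDP) could be tracked as variables enter a model. The function names `tpp_fdp` and `tradeoff_curve`, and the idea of feeding in a fixed entry order, are assumptions made for illustration and are not taken from the paper; in particular, a real Lasso path can also drop variables, which this sketch ignores.

```python
def tpp_fdp(selected, true_support):
    """Return (TPP, FDP) for one selected set against the true support.

    TPP = true positives / number of true signals.
    FDP = false positives / number of selected variables.
    Denominators are floored at 1 to avoid division by zero.
    """
    selected, true_support = set(selected), set(true_support)
    true_pos = len(selected & true_support)
    false_pos = len(selected - true_support)
    tpp = true_pos / max(len(true_support), 1)
    fdp = false_pos / max(len(selected), 1)
    return tpp, fdp


def tradeoff_curve(entry_order, true_support):
    """(TPP, FDP) after each variable enters, for a hypothetical entry order.

    Assumes variables only enter and never leave, which is a simplification.
    """
    return [
        tpp_fdp(entry_order[:step], true_support)
        for step in range(1, len(entry_order) + 1)
    ]


if __name__ == "__main__":
    # Hypothetical example: signals are variables 0..3; variable 7 is a null
    # that enters third, ahead of the true signal 2.
    for tpp, fdp in tradeoff_curve([0, 1, 7, 2], {0, 1, 2, 3}):
        print(f"TPP={tpp:.2f}  FDP={fdp:.2f}")
```

A curve that reaches a high TPP while the FDP stays low corresponds to a better trade-off; an early jump in FDP corresponds to a false variable being selected early.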
Abstract
In high-dimensional sparse regression, would increasing the signal-to-noise ratio while fixing the sparsity level always lead to better model selection? For high-dimensional sparse regression problems, surprisingly, in this paper we answer this question in the negative in the regime of linear sparsity for the Lasso method, relying on a new concept we term effect size heterogeneity. Roughly speaking, a regression coefficient vector has high effect size heterogeneity if its nonzero entries have significantly different magnitudes. From the viewpoint of this new measure, we prove that the false and true positive rates achieve the optimal trade-off uniformly along the Lasso path when this measure is maximal in a certain sense, and the worst trade-off is achieved when it is minimal in the sense that all nonzero effect sizes are roughly equal. Moreover, we demonstrate that the first false…
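For readers unfamiliar with the setting, the Lasso path and the linear sparsity regime mentioned in the abstract are usually written as follows. The notation (response y, design matrix X with n rows and p columns, k nonzero coefficients, and limits ε and δ) is an assumption chosen for this sketch and may differ from the paper's own symbols.

```latex
% Lasso estimate at penalty level \lambda; the "Lasso path" is the
% family \{\hat{\beta}(\lambda) : \lambda > 0\}.
\hat{\beta}(\lambda)
  = \operatorname*{arg\,min}_{b \in \mathbb{R}^{p}}
    \tfrac{1}{2}\,\lVert y - Xb \rVert_2^2 + \lambda \lVert b \rVert_1

% Linear sparsity: the number of nonzero coefficients k grows
% proportionally to the dimension p, and n grows proportionally to p.
\frac{k}{p} \to \epsilon \in (0, 1),
\qquad
\frac{n}{p} \to \delta \in (0, \infty)
```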
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Advanced Bandit Algorithms Research
