Simulation-calibration testing for inference in Lasso regressions
Matthieu Pluntz, Cyril Dalmasso, Pascale Tubert-Bitter, Ismail Ahmed

TL;DR
This paper introduces a simulation-calibration test of the significance of a variable entering the Lasso path, and uses it to select a model on that path while controlling the Family-Wise Error Rate. The method is developed for linear models and adapted heuristically to generalized linear models.
Contribution
It develops a simulation-calibration test for variables entering the Lasso path and a model selection procedure built on it that controls the Family-Wise Error Rate, with a heuristic extension to generalized linear models.
Findings
The test controls false positive risk in linear models.
Simulation studies demonstrate effective model selection.
Application to pharmacovigilance data identifies relevant exposures.
Abstract
We propose a test of the significance of a variable appearing on the Lasso path and use it in a procedure for selecting one of the models of the Lasso path, controlling the Family-Wise Error Rate. Our null hypothesis depends on a set A of already selected variables and states that it contains all the active variables. We focus on the regularization parameter value from which a first variable outside A is selected. As the test statistic, we use this quantity's conditional p-value, which we define conditional on the non-penalized estimated coefficients of the model restricted to A. We estimate this by simulating outcome vectors and then calibrating them on the observed outcome's estimated coefficients. We adapt the calibration heuristically to the case of generalized linear models in which it turns into an iterative stochastic procedure. We prove that the test controls the risk of…
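The simulate-then-calibrate idea described in the abstract can be illustrated for the linear case with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions that are not in the source: the variables in A are left unpenalized, so that under the objective (1/(2n))·||y − Xb||² + λ·Σ_{j∉A}|b_j| the entry value has the closed form max_{j∉A} |X_jᵀr|/n, with r the residual of the least-squares fit on A; calibration is taken to mean matching both the least-squares coefficients on A and the residual norm of the observed outcome; and simulated outcomes are drawn as standard Gaussian noise. The function names `ols_fit`, `entry_lambda`, `calibrate`, and `sim_cal_pvalue` are hypothetical.

```python
import numpy as np


def ols_fit(X_A, y):
    """Least-squares fit of y on X_A; returns (coefficients, residuals)."""
    if X_A.shape[1] == 0:  # empty A: nothing to fit
        return np.zeros(0), y.copy()
    beta, *_ = np.linalg.lstsq(X_A, y, rcond=None)
    return beta, y - X_A @ beta


def entry_lambda(X, y, A):
    """Largest lambda at which a variable outside A becomes active.

    Assumes the variables in A are unpenalized (a simplification made
    for this sketch), which gives a closed form.
    """
    n, p = X.shape
    A = np.asarray(A, dtype=int)
    outside = np.setdiff1d(np.arange(p), A)
    _, resid = ols_fit(X[:, A], y)
    return np.max(np.abs(X[:, outside].T @ resid)) / n


def calibrate(y_sim, X_A, beta_obs, resid_norm_obs):
    """Shift and rescale y_sim so that its fit on A matches the observed one.

    Assumption: calibration matches the coefficients on A and the
    residual norm of the observed outcome.
    """
    _, resid_sim = ols_fit(X_A, y_sim)
    resid_sim = resid_sim * (resid_norm_obs / np.linalg.norm(resid_sim))
    return X_A @ beta_obs + resid_sim


def sim_cal_pvalue(X, y, A, n_sim=500, seed=0):
    """Monte Carlo estimate of the conditional p-value of the entry lambda."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=int)
    X_A = X[:, A]
    beta_obs, resid_obs = ols_fit(X_A, y)
    resid_norm_obs = np.linalg.norm(resid_obs)
    lam_obs = entry_lambda(X, y, A)
    exceed = 0
    for _ in range(n_sim):
        y_sim = rng.standard_normal(X.shape[0])  # simulated outcome vector
        y_cal = calibrate(y_sim, X_A, beta_obs, resid_norm_obs)
        if entry_lambda(X, y_cal, A) >= lam_obs:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (1 + exceed) / (1 + n_sim)
```

A small estimated p-value suggests that the first variable to enter outside A carries signal that A does not explain. The standard Lasso path, in which every variable is penalized, has no such closed form and would require a path solver in place of `entry_lambda`.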
Taxonomy
Topics: Statistical Methods and Inference
