Simulation-Selection-Extrapolation: Estimation in High-Dimensional Errors-in-Variables Models
Linh Nghiem, Cornelis Potgieter

TL;DR
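This paper introduces SIMSELEX, an estimation method for high-dimensional errors-in-variables models that adds a group-lasso variable selection step to SIMEX. It performs well in variable selection and has lower estimation error than naive estimators that ignore measurement error, and it can be applied in a variety of errors-in-variables settings, including linear models.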
Contribution
The paper proposes SIMSELEX (SIMulation-SELection-EXtrapolation), which augments the traditional SIMEX approach with a variable selection step based on the group lasso, to improve estimation accuracy in high-dimensional errors-in-variables models.
Findings
SIMSELEX outperforms naive estimators in variable selection.
It achieves significantly lower estimation error than naive estimators that ignore measurement error.
It is successfully applied to gene expression data in cancer research.
Abstract
This paper considers errors-in-variables models in a high-dimensional setting where the number of covariates can be much larger than the sample size, and there are only a small number of non-zero covariates. The presence of measurement error in the covariates can result in severely biased parameter estimates, and also affects the ability of penalized methods such as the lasso to recover the true sparsity pattern. A new estimation procedure called SIMSELEX (SIMulation-SELection-EXtrapolation) is proposed. This procedure augments the traditional SIMEX approach with a variable selection step based on the group lasso. The SIMSELEX estimator is shown to perform well in variable selection, and has significantly lower estimation error than naive estimators that ignore measurement error. SIMSELEX can be applied in a variety of errors-in-variables settings, including linear models, generalized…
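The three stages named in the abstract can be sketched in a few lines of Python for the linear model. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes additive Gaussian measurement error with a known standard deviation `sigma_u`, uses a small hand-written lasso (proximal gradient, `lasso_ista`) in the simulation step, a quadratic extrapolant, and a group-lasso-style threshold on orthonormalized extrapolation coefficients in the selection step. All function names, parameter names (`alpha`, `xi`, `B`, `lambdas`), and default values are hypothetical.

```python
import numpy as np

def lasso_ista(X, y, alpha, n_iter=500):
    """Lasso by proximal gradient: min (1/2n)||y - Xb||^2 + alpha * ||b||_1."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        z = b + step * (X.T @ (y - X @ b)) / n
        b = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)  # soft-threshold
    return b

def simselex_sketch(W, y, sigma_u, alpha=0.1, xi=0.5,
                    lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), B=20, seed=0):
    """Hypothetical SIMEX-plus-selection sketch; W is the error-prone design matrix."""
    rng = np.random.default_rng(seed)
    n, p = W.shape
    lam = np.asarray(lambdas, dtype=float)

    # Simulation: inflate the measurement error variance by a factor (1 + lam),
    # refit the lasso, and average the estimates over B pseudo-datasets.
    theta = np.zeros((len(lam), p))
    for i, l in enumerate(lam):
        if l == 0.0:
            theta[i] = lasso_ista(W, y, alpha)
            continue
        for _ in range(B):
            Wb = W + np.sqrt(l) * sigma_u * rng.standard_normal((n, p))
            theta[i] += lasso_ista(Wb, y, alpha) / B

    # Selection: each covariate owns one group of quadratic-extrapolant
    # coefficients. With an orthonormal basis Q, the single-group group lasso
    # sets the whole group to zero exactly when ||Q^T theta_j||_2 <= xi.
    L = np.column_stack([np.ones_like(lam), lam, lam ** 2])
    Q, _ = np.linalg.qr(L)
    selected = np.linalg.norm(Q.T @ theta, axis=0) > xi

    # Extrapolation: for selected covariates, fit the quadratic in lam by
    # least squares and evaluate it at lam = -1 (the "no measurement error" point).
    est = np.zeros(p)
    if selected.any():
        coef = np.linalg.lstsq(L, theta[:, selected], rcond=None)[0]
        est[selected] = np.array([1.0, -1.0, 1.0]) @ coef
    return est, selected

# Small synthetic example (all values are illustrative).
rng = np.random.default_rng(1)
n, p, sigma_u = 150, 30, 0.5
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.5 * rng.standard_normal(n)
W = X + sigma_u * rng.standard_normal((n, p))  # covariates observed with error
est, selected = simselex_sketch(W, y, sigma_u, B=10)
print("selected indices:", np.flatnonzero(selected))
```

In practice the tuning parameters `alpha` and `xi` would be chosen by a data-driven rule such as cross-validation rather than fixed, and the extrapolant need not be quadratic.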
