Omitted variable bias of Lasso-based inference methods: A finite sample analysis
Kaspar Wuthrich, Ying Zhu

TL;DR
This paper analyzes the finite sample behavior of Lasso-based inference methods such as post double Lasso and debiased Lasso. It shows that they can suffer from substantial omitted variable bias even in large samples, which makes relying on the existing asymptotic inference theory problematic.
Contribution
The paper provides a finite sample analysis of Lasso-based inference methods, highlights their limitations, and compares them to modern high-dimensional OLS-based methods.
Findings
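Lasso-based inference methods can exhibit substantial omitted variable bias in finite samples because Lasso may fail to select relevant controls.
The bias can occur even when the coefficients are sparse and the sample size is large and exceeds the number of controls.
The paper compares Lasso-based inference with modern high-dimensional OLS-based methods and offers practical guidance.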
Abstract
We study the finite sample behavior of Lasso-based inference methods such as post double Lasso and debiased Lasso. We show that these methods can exhibit substantial omitted variable biases (OVBs) due to Lasso not selecting relevant controls. This phenomenon can occur even when the coefficients are sparse and the sample size is large and larger than the number of controls. Therefore, relying on the existing asymptotic inference theory can be problematic in empirical applications. We compare the Lasso-based inference methods to modern high-dimensional OLS-based methods and provide practical guidance.
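The mechanism named in the abstract, a relevant control that is not selected, can be illustrated with a small simulation. The sketch below is not the authors' procedure and runs no Lasso at all. It assumes a hypothetical one-control design, Y = alpha*D + beta*X + noise with D = gamma*X + noise, and shows what the estimate of alpha looks like once X has been dropped. All function names, parameter names, and parameter values are illustrative assumptions. In this design the textbook omitted variable bias is beta*gamma*Var(X) / (gamma^2*Var(X) + Var(v)), where v is the noise in D.

```python
import random


def simulate_short_regression(n=20000, alpha=1.0, beta=0.5, gamma=0.5, seed=0):
    """Estimate alpha by regressing Y on D alone, with the control X left out.

    Hypothetical design (an assumption, not taken from the paper):
        D = gamma * X + v
        Y = alpha * D + beta * X + e
    with X, v, e independent standard normal draws.
    """
    rng = random.Random(seed)
    d_vals, y_vals = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        d = gamma * x + rng.gauss(0.0, 1.0)
        y = alpha * d + beta * x + rng.gauss(0.0, 1.0)
        d_vals.append(d)
        y_vals.append(y)

    d_mean = sum(d_vals) / n
    y_mean = sum(y_vals) / n
    cov_dy = sum((d - d_mean) * (y - y_mean) for d, y in zip(d_vals, y_vals))
    var_d = sum((d - d_mean) ** 2 for d in d_vals)
    # OLS slope of Y on D (with intercept) when X is omitted
    return cov_dy / var_d


def theoretical_bias(beta, gamma, var_x=1.0, var_v=1.0):
    """Textbook omitted variable bias: beta * Cov(D, X) / Var(D)."""
    return beta * gamma * var_x / (gamma ** 2 * var_x + var_v)


if __name__ == "__main__":
    alpha_true = 1.0
    alpha_hat = simulate_short_regression(alpha=alpha_true)
    print("simulated bias:  ", alpha_hat - alpha_true)
    print("theoretical bias:", theoretical_bias(beta=0.5, gamma=0.5))
```

Increasing `n` in this sketch tightens the estimate around the biased value instead of removing the bias, because the bias comes from the missing control and not from sampling noise. Whether Lasso actually drops a given control in practice depends on the design and the penalty level, which is what the paper's finite sample analysis addresses.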
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Evolution and Genetic Dynamics
