Heteroskedasticity-robust inference in linear regression models with many covariates
Koen Jochmans

TL;DR
This paper develops a new heteroskedasticity-robust covariance estimator for linear regression models with many control variables, ensuring valid inference when the number of controls grows proportionally with the sample size.
Contribution
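It introduces an alternative covariance-matrix estimator for settings where the number of controls grows at the same rate as the sample size, complementing recent work by Cattaneo, Jansson and Newey (2018). It gives high-level conditions for asymptotically size-correct inference, along with more primitive conditions for three special cases.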
Findings
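Tests based on the usual heteroskedasticity-robust covariance estimators are size distorted even in large samples when the number of controls grows at the same rate as the sample size; the new estimator delivers asymptotically size-correct inference under the stated conditions.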
Simulation results demonstrate improved inference accuracy.
An empirical application to the union premium illustrates the practical benefits.
Abstract
We consider inference in linear regression models that is robust to heteroskedasticity and the presence of many control variables. When the number of control variables increases at the same rate as the sample size, the usual heteroskedasticity-robust estimators of the covariance matrix are inconsistent. Hence, tests based on these estimators are size distorted even in large samples. An alternative covariance-matrix estimator for such a setting is presented that complements recent work by Cattaneo, Jansson and Newey (2018). We provide high-level conditions for our approach to deliver (asymptotically) size-correct inference as well as more primitive conditions for three special cases. Simulation results and an empirical illustration to inference on the union premium are also provided.
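To make the setting concrete, the following is a minimal sketch of the usual heteroskedasticity-robust (HC0, Eicker-White) standard error for one coefficient of interest when many controls are partialled out. It is not the paper's estimator; it only shows the conventional calculation that the abstract describes as inconsistent when the number of controls is proportional to the sample size. The function name `hc0_with_controls`, the simulated design (n = 400, q = 200), and the error-variance form are all assumptions made for illustration.

```python
import numpy as np


def hc0_with_controls(y, x, W):
    """Coefficient on x from regressing y on (x, W), with the usual HC0 standard error.

    Hypothetical helper for illustration; this is the conventional estimator,
    not the alternative proposed in the paper.
    """
    # Partial the controls W out of y and x (Frisch-Waugh-Lovell).
    coef_y, *_ = np.linalg.lstsq(W, y, rcond=None)
    coef_x, *_ = np.linalg.lstsq(W, x, rcond=None)
    y_t = y - W @ coef_y
    x_t = x - W @ coef_x

    # OLS slope on the residualized regressor.
    beta = (x_t @ y_t) / (x_t @ x_t)

    # These residuals coincide with those of the full regression of y on (x, W).
    u = y_t - beta * x_t

    # HC0 sandwich variance for a scalar coefficient.
    var = np.sum(x_t**2 * u**2) / (x_t @ x_t) ** 2
    return beta, np.sqrt(var)


# Assumed simulation design: the number of controls is half the sample size.
rng = np.random.default_rng(0)
n, q = 400, 200
W = np.column_stack([np.ones(n), rng.standard_normal((n, q - 1))])
x = rng.standard_normal(n)
sigma = 1.0 + np.abs(x)  # assumed heteroskedastic error scale
y = 1.0 * x + sigma * rng.standard_normal(n)

beta_hat, se_hc0 = hc0_with_controls(y, x, W)
```

Repeating this over many simulated samples and comparing the average of `se_hc0` with the Monte Carlo standard deviation of `beta_hat` is one way to examine how the conventional estimator behaves as q/n varies.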
