Unbiased estimation of the OLS covariance matrix when the errors are clustered
Tom Boot, Gianmaria Niccodemi, Tom Wansbeek

TL;DR
This paper derives a covariance matrix estimator for OLS with clustered errors that is unbiased under the random-effects model, extends it to two more general error structures, and shows by simulation that it performs well with cluster-specific treatments, even when the clusters are highly unbalanced.
Contribution
It derives an unbiased covariance estimator for the random-effects model and extends it to more general structures, improving inference accuracy in clustered data analysis.
Findings
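The unbiased estimator proposed for the random-effects model performs very well when the regressor is a cluster-specific treatment variable.
The choice of estimator affects the size of the t-test when the regressor is a cluster-specific treatment, but hardly matters when the regressor has the same distribution over the clusters.
The proposed estimator retains its performance even when the clusters are highly unbalanced.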
Abstract
When data are clustered, common practice has become to do OLS and use an estimator of the covariance matrix of the OLS estimator that comes close to unbiasedness. In this paper we derive an estimator that is unbiased when the random-effects model holds. We do the same for two more general structures. We study the usefulness of these estimators against others by simulation, the size of the t-test being the criterion. Our findings suggest that the choice of estimator hardly matters when the regressor has the same distribution over the clusters. But when the regressor is a cluster-specific treatment variable, the choice does matter and the unbiased estimator we propose for the random-effects model shows excellent performance, even when the clusters are highly unbalanced.
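To make the setting in the abstract concrete, the following is a minimal sketch of the conventional workflow it refers to: fit OLS, then estimate the covariance of the coefficients with a cluster-robust sandwich. It uses the standard CR0 (Liang-Zeger) form, which is not the unbiased estimator derived in the paper. The function names, the alternating treatment assignment, and the intra-cluster correlation `rho` are illustrative assumptions, not taken from the source.

```python
import numpy as np


def ols_cluster_cov(X, y, cluster_ids):
    """OLS fit plus the standard CR0 cluster-robust covariance.

    This is the conventional sandwich estimator, NOT the unbiased
    estimator proposed in the paper.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        rows = cluster_ids == g
        score = X[rows].T @ resid[rows]  # X_g' e_g for cluster g
        meat += np.outer(score, score)
    return beta, XtX_inv @ meat @ XtX_inv


def simulate_t_stat(cluster_sizes, rho=0.5, seed=0):
    """Draw one sample with random-effects errors and a cluster-level
    treatment whose true coefficient is zero; return its t-statistic."""
    rng = np.random.default_rng(seed)
    sizes = np.asarray(cluster_sizes)
    n_clusters = sizes.size
    ids = np.repeat(np.arange(n_clusters), sizes)
    # Hypothetical design: every second cluster is treated.
    treat = (np.arange(n_clusters) % 2 == 0).astype(float)[ids]
    # Random-effects errors: shared cluster component plus idiosyncratic noise.
    cluster_effect = rng.normal(scale=np.sqrt(rho), size=n_clusters)[ids]
    noise = rng.normal(scale=np.sqrt(1.0 - rho), size=ids.size)
    y = cluster_effect + noise
    X = np.column_stack([np.ones(ids.size), treat])
    beta, V = ols_cluster_cov(X, y, ids)
    return beta[1] / np.sqrt(V[1, 1])
```

Calling `simulate_t_stat([5, 5, 10, 10, 40, 40])` gives one t-statistic for an unbalanced design. Repeating the call over many seeds and counting how often the statistic exceeds a chosen critical value in absolute value would give an empirical test size, which is the criterion the abstract describes.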
Taxonomy
Topics: Statistical Methods and Bayesian Inference
