Testing for the appropriate level of clustering in linear regression models
James G. MacKinnon, Morten Ørregaard Nielsen, Matthew D. Webb

TL;DR
This paper introduces two statistical tests and a sequential procedure for determining the appropriate level of clustering in linear regression models, rather than assuming, as most empirical work does, that the clustering structure is known.
Contribution
It proposes two tests for the correct level of clustering in regression models, one for inference about a single coefficient and one for two or more coefficients, with both asymptotic and wild bootstrap implementations.
Findings
In simulations, the bootstrap tests perform very well under the null hypothesis.
The tests can have excellent power against alternatives of coarser clustering.
Empirical application shows improved inference accuracy.
Abstract
The overwhelming majority of empirical research that uses cluster-robust inference assumes that the clustering structure is known, even though there are often several possible ways in which a dataset could be clustered. We propose two tests for the correct level of clustering in regression models. One test focuses on inference about a single coefficient, and the other on inference about two or more coefficients. We provide both asymptotic and wild bootstrap implementations. The proposed tests work for a null hypothesis of either no clustering or "fine" clustering against alternatives of "coarser" clustering. We also propose a sequential testing procedure to determine the appropriate level of clustering. Simulations suggest that the bootstrap tests perform very well under the null hypothesis and can have excellent power. An empirical example suggests that using the tests leads to…
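To make concrete why the level of clustering matters, the following minimal sketch computes a textbook cluster-robust (sandwich) variance for a single OLS slope under a "fine" and a "coarser" grouping of the same observations. It is not the paper's test statistic or bootstrap procedure. The function name `crve_single_regressor`, the no-intercept model, the omission of small-sample corrections, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def crve_single_regressor(x, y, cluster_ids):
    """OLS slope for y = beta * x + u (no intercept) and a basic
    cluster-robust variance estimate for that slope.

    Illustrative sandwich form without small-sample corrections:
        var = sum_g (sum_{i in g} x_i * u_i)^2 / (sum_i x_i^2)^2
    """
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    residuals = [yi - beta * xi for xi, yi in zip(x, y)]

    # Sum the score contributions x_i * u_i within each cluster.
    cluster_scores = defaultdict(float)
    for xi, ui, g in zip(x, residuals, cluster_ids):
        cluster_scores[g] += xi * ui

    variance = sum(s * s for s in cluster_scores.values()) / (sxx * sxx)
    return beta, variance


# Hypothetical toy data: 8 observations, 4 fine clusters nested in 2 coarse ones.
x = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
y = [1.5, 2.5, 1.4, 2.6, 0.6, 1.4, 0.5, 1.5]
fine_ids = [0, 0, 1, 1, 2, 2, 3, 3]
coarse_ids = [0, 0, 0, 0, 1, 1, 1, 1]

beta_fine, var_fine = crve_single_regressor(x, y, fine_ids)
beta_coarse, var_coarse = crve_single_regressor(x, y, coarse_ids)

print("slope:", beta_fine)
print("variance, fine clustering:  ", var_fine)
print("variance, coarse clustering:", var_coarse)
```

The slope estimate is the same under both groupings; only the variance estimate changes. In this toy dataset the residuals share a sign within each coarse cluster, so the coarse-level variance is larger, which is the kind of discrepancy that makes the choice of clustering level consequential for inference.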
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Statistical Methods and Inference
Methods: Test
