Unified and robust Lagrange multiplier type tests for cross-sectional independence in large panel data models
Zhenhong Huang, Zhaoyuan Li, Jianfeng Yao

TL;DR
This paper introduces a unified, robust Lagrange multiplier test for detecting cross-sectional dependence in large panel data models, applicable across various model types and error distributions, with theoretical validation and simulation support.
Contribution
It develops a unified test procedure and a power enhancement version for cross-sectional independence, valid under broad panel data settings and error distributions.
Findings
The tests are asymptotically valid under large panel asymptotics.
Monte Carlo experiments confirm robustness and power of the proposed tests.
The power enhancement technique improves detection capabilities.
| Heterogeneous coefficients | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Fixed effects panels | * | * | ✓ | * | ✓ | ✓ |
| Dynamic panels | ✓ | * | ✓ | * | ✓ | ✓ |
| Weakly exogenous regressors | * | * | ✓ | ✓ | ||
| Non-normal errors | ✓ | * | * | * | ✓ | ✓ |
| SIM-L | ✓ | ✓ | ✓ | ✓ | ✓ |
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 5.15 | 5.15 | 5.55 | 5.55 | 4.65 | 5.35 | 4.9 | 5.25 | 5.15 | ||
| 5.4 | 5.7 | 5.6 | 5.2 | 4.45 | 5.8 | 4.8 | 5 | 5.55 | ||
| 5.85 | 5.35 | 5.85 | 6 | 4.7 | 5.45 | 5.4 | 5.45 | 5.25 | ||
| 4.75 | 4.45 | 4.65 | 4.85 | 4.25 | 5.25 | 4.5 | 4.15 | 5.1 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 5.5 | 5.15 | 4.55 | 5.05 | 4.5 | 4.95 | 5.15 | 4.95 | 6 | ||
| 5.5 | 5.9 | 4.75 | 4.45 | 4.85 | 4.9 | 5.05 | 4.7 | 5.6 | ||
| 5.7 | 5.25 | 4.6 | 5.1 | 4.6 | 5 | 5.35 | 4.95 | 6 | ||
| 5.4 | 4.85 | 5.05 | 5.25 | 5.2 | 5.1 | 5.2 | 4.75 | 5.2 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 5.8 | 5.3 | 6.2 | 5.1 | 5.45 | 5 | 5.4 | 5.6 | 5.5 | ||
| 5.5 | 5 | 5.55 | 4.95 | 5.55 | 4.65 | 5.4 | 5.85 | 5.35 | ||
| 5.7 | 5.3 | 6.2 | 5.05 | 5.45 | 5 | 5.2 | 5.5 | 5.5 | ||
| 4.05 | 4.75 | 5.7 | 5 | 4.85 | 5.4 | 4.85 | 5.05 | 4.9 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 5.65 | 5.05 | 5.65 | 5.5 | 5.25 | 5.4 | 5.3 | 5.4 | 5.3 | ||
| 5.4 | 5.3 | 5.5 | 5.35 | 5.2 | 5 | 5.4 | 5.65 | 5.45 | ||
| 5.75 | 5.05 | 5.75 | 5.6 | 5.25 | 5.4 | 5.4 | 5.45 | 5.3 | ||
| 4.95 | 5.1 | 4.55 | 5.45 | 5.3 | 5.1 | 4.85 | 5.35 | 5.2 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 7.3 | 5 | 5.25 | 6.45 | 6 | 5.35 | 6.4 | 6.25 | 5.3 | ||
| 7.1 | 5.35 | 5.05 | 5.75 | 5.7 | 5.35 | 6.2 | 5.5 | 5.1 | ||
| 6.7 | 4.55 | 5.2 | 5.75 | 5.6 | 5.2 | 5.6 | 5.7 | 5.1 | ||
| 5.55 | 4.75 | 5 | 4.35 | 4.7 | 5.25 | 4.3 | 4.4 | 5.35 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 6.75 | 5.85 | 6.25 | 7.65 | 5.9 | 5.55 | 7.55 | 5.75 | 4.95 | ||
| 6.25 | 6.15 | 5.55 | 7.4 | 5.75 | 5.55 | 7.25 | 5.75 | 4.95 | ||
| 5 | 5 | 5.55 | 5.8 | 4.75 | 5.1 | 5.65 | 4.9 | 4.65 | ||
| 5.9 | 5.7 | 4.85 | 5.15 | 5.3 | 5.5 | 4.45 | 5.5 | 4.95 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 91.45 | 98.9 | 99.85 | 99.55 | 98.75 | 100 | 99.6 | 99.75 | 99.8 | ||
| 95.95 | 99.85 | 100 | 99.85 | 99.75 | 100 | 99.85 | 100 | 99.95 | ||
| 92.3 | 98.95 | 99.85 | 99.55 | 98.8 | 100 | 99.6 | 99.8 | 99.8 | ||
| 5.15 | 4.9 | 4.45 | 5.29 | 5.05 | 5.1 | 5.35 | 4.75 | 4.7 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 75.45 | 91.45 | 96.7 | 81.25 | 80.55 | 95.45 | 79.95 | 86.1 | 98.1 | ||
| 84.2 | 97.4 | 99.9 | 89.6 | 93.1 | 99.85 | 89.65 | 95.2 | 100 | ||
| 75.95 | 91.6 | 96.8 | 81.85 | 80.8 | 95.45 | 80.4 | 86.25 | 98.1 | ||
| 5.05 | 11 | 5.2 | 4.9 | 4.95 | 4.85 | 4.3 | 4.7 | 4.35 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 60.4 | 59.8 | 74.75 | 43 | 60.25 | 72.9 | 62.7 | 55.55 | 62.6 | ||
| 73.85 | 78.55 | 91.35 | 55.35 | 78.35 | 91.75 | 75.9 | 73.85 | 84.7 | ||
| 60.2 | 59.8 | 74.75 | 42.8 | 60.2 | 72.9 | 62.6 | 55.5 | 62.45 | ||
| 7.2 | 4.75 | 4.7 | 4.65 | 4.6 | 5.4 | 4.9 | 5.6 | 5.05 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 90.1 | 99.2 | 99.55 | 98.15 | 98.9 | 99.95 | 95.1 | 99.4 | 99.9 | ||
| 94.9 | 99.85 | 100 | 99.4 | 99.85 | 100 | 97.55 | 99.85 | 100 | ||
| 90.2 | 99.2 | 99.55 | 98.15 | 98.95 | 99.95 | 95.15 | 99.4 | 99.9 | ||
| 5.2 | 7.65 | 4.4 | 31.65 | 4.85 | 4.85 | 4.7 | 5.55 | 4.8 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 80.7 | 91.9 | 96.25 | 79.2 | 87.3 | 95.65 | 78.3 | 91.85 | 96.55 | ||
| 88.45 | 98.1 | 99.65 | 88.85 | 95.85 | 99.55 | 87.6 | 97.95 | 99.65 | ||
| 79.55 | 91.6 | 96.2 | 78.5 | 87 | 95.5 | 76.8 | 91.4 | 96.55 | ||
| 4.1 | 5.2 | 4.6 | 4.05 | 5 | 4.35 | 4 | 5.2 | 5.35 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 48.15 | 70.95 | 70.15 | 49.15 | 60.8 | 68.6 | 47.4 | 60.1 | 65.7 | ||
| 60.1 | 87.5 | 91.05 | 60.75 | 78.6 | 89.1 | 58.65 | 78.25 | 86.3 | ||
| 43.35 | 68.3 | 68.75 | 43.65 | 58.25 | 67.1 | 42.65 | 56.75 | 63.85 | ||
| 5.25 | 4.7 | 4.9 | 4.55 | 5.05 | 4.95 | 7.55 | 5.3 | 4.3 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 20.4 | 28.85 | 43.95 | 20.55 | 25.9 | 39.85 | 20.25 | 20.3 | 27.7 | ||
| 18.35 | 30.65 | 52.25 | 17.2 | 28.15 | 45.6 | 18.75 | 21.05 | 31.25 | ||
| 21.5 | 29.25 | 44.45 | 21.55 | 26.75 | 40.25 | 21.6 | 21 | 28 | ||
| 7.45 | 6.75 | 6.65 | 7.15 | 7.8 | 6.9 | 7.4 | 6.85 | 6.5 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 23.6 | 13.6 | 38.35 | 14.05 | 18.95 | 39.55 | 11.9 | 23.75 | 31.3 | ||
| 25.6 | 13.8 | 49.3 | 14.15 | 20.5 | 50.5 | 11.95 | 25.85 | 38 | ||
| 24.2 | 13.65 | 38.45 | 14.75 | 19.2 | 39.7 | 12.45 | 24.05 | 31.4 | ||
| 7.9 | 6 | 7.35 | 7.35 | 6 | 7.25 | 6.8 | 5.9 | 6.4 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 12.2 | 11.85 | 27.9 | 10.45 | 18.4 | 38.45 | 11.8 | 19.75 | 54.15 | ||
| 13.4 | 12.5 | 35.8 | 10.8 | 20.35 | 52.75 | 12.65 | 21.9 | 72.15 | ||
| 12.1 | 11.8 | 27.8 | 10.4 | 18.35 | 38.3 | 11.7 | 19.7 | 54.1 | ||
| 6.55 | 5.45 | 6.55 | 6.65 | 5.9 | 7.1 | 6.2 | 6 | 6.9 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 10.45 | 24.3 | 26.4 | 9.15 | 27.25 | 27 | 17.5 | 51.7 | 28.5 | ||
| 10 | 27.85 | 30.95 | 9.65 | 28.65 | 28.9 | 15.85 | 58.05 | 30.35 | ||
| 10.5 | 24.35 | 26.4 | 9.3 | 27.6 | 27.05 | 17.55 | 51.95 | 28.5 | ||
| 6.25 | 7.75 | 5.95 | 7.1 | 7.75 | 7 | 7.1 | 9.1 | 6.75 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 16.5 | 29.9 | 16.45 | 15 | 13.35 | 36.35 | 16.85 | 9.5 | 50.3 | ||
| 16.45 | 34 | 18.8 | 15.05 | 13.55 | 46.5 | 17.15 | 10.35 | 64.95 | ||
| 15.35 | 28.75 | 16.1 | 13.65 | 12.9 | 35.95 | 15.6 | 9.05 | 49.9 | ||
| 7 | 6.95 | 6.2 | 7.4 | 7 | 6.9 | 7.7 | 5.75 | 7 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 14.3 | 18.2 | 55.95 | 11.15 | 15.55 | 47.65 | 8.4 | 15.85 | 77.55 | ||
| 14.1 | 19.85 | 75.55 | 11.1 | 15.6 | 64.25 | 9.1 | 17.1 | 93.3 | ||
| 10.85 | 16 | 54.4 | 8.25 | 13.7 | 46.35 | 6.2 | 13.75 | 75.85 | ||
| 5.1 | 6.15 | 7.5 | 5.4 | 6.1 | 7.25 | 5.1 | 6.25 | 7.6 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 99.3 | 100 | 100 | 98.95 | 99.9 | 100 | 99.15 | 97.85 | 100 | ||
| 99.9 | 100 | 100 | 99.6 | 100 | 100 | 99.75 | 99.4 | 100 | ||
| 99.55 | 100 | 100 | 99.1 | 99.9 | 100 | 99.2 | 97.9 | 100 | ||
| 54.1 | 71.6 | 96.9 | 55.7 | 66.5 | 96.8 | 54.45 | 46.55 | 95.15 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 99.9 | 100 | 100 | 91.45 | 100 | 100 | 75.6 | 100 | 100 | ||
| 100 | 100 | 100 | 96.7 | 100 | 100 | 84.95 | 100 | 100 | ||
| 99.9 | 100 | 100 | 91.6 | 100 | 100 | 76.05 | 100 | 100 | ||
| 59.8 | 68.3 | 97.85 | 43.1 | 76.85 | 97.6 | 33.9 | 82.45 | 98.2 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 99.8 | 100 | 100 | 97.95 | 100 | 100 | 97.45 | 100 | 100 | ||
| 100 | 100 | 100 | 99.65 | 100 | 100 | 99.45 | 100 | 100 | ||
| 99.75 | 100 | 100 | 97.85 | 100 | 100 | 97.45 | 100 | 100 | ||
| 60.85 | 79.15 | 96.9 | 51 | 80.2 | 94.75 | 49.9 | 78 | 98.6 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 99.4 | 99.95 | 100 | 86.9 | 99.95 | 100 | 96.05 | 100 | 100 | ||
| 99.7 | 100 | 100 | 92.65 | 100 | 100 | 98.35 | 100 | 100 | ||
| 99.4 | 99.95 | 100 | 87 | 99.95 | 100 | 96.1 | 100 | 100 | ||
| 56.55 | 64.8 | 95.15 | 38.7 | 62.5 | 95.75 | 46.15 | 87.2 | 93.05 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 87.8 | 100 | 100 | 78.65 | 100 | 100 | 94.85 | 99.75 | 100 | ||
| 94.45 | 100 | 100 | 87.75 | 100 | 100 | 98.4 | 100 | 100 | ||
| 86.45 | 100 | 100 | 77.65 | 100 | 100 | 94.15 | 99.75 | 100 | ||
| 39.7 | 75.2 | 96.85 | 34.5 | 71 | 93.6 | 45.25 | 58 | 96.35 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 97.4 | 100 | 100 | 95.25 | 100 | 100 | 91.15 | 100 | 100 | ||
| 99.45 | 100 | 100 | 98.6 | 100 | 100 | 96.65 | 100 | 100 | ||
| 96.55 | 100 | 100 | 93.45 | 100 | 100 | 88.4 | 100 | 100 | ||
| 46.15 | 74.4 | 98.4 | 44.15 | 74.9 | 93.3 | 40.75 | 80.35 | 97.75 | ||
| Weakly exogenous | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 5.85 | 4.45 | 4.75 | 5.1 | 4.7 | 5.55 | 4.7 | 5.2 | 5.8 | ||
| 5.15 | 4.15 | 4.9 | 4.75 | 4.9 | 6.4 | 5.3 | 5.05 | 5.6 | ||
| 9.25 | 8.4 | 8.6 | 8.8 | 8.8 | 9.2 | 8.95 | 8.85 | 9.2 | ||
| 4.65 | 4.25 | 4 | 5 | 4.5 | 5.05 | 5.6 | 4.4 | 5.35 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 4.55 | 5.4 | 5.8 | 5.65 | 4.95 | 5.05 | 5.55 | 5 | 5.9 | ||
| 5.2 | 5.05 | 5.15 | 5.6 | 4.95 | 4.9 | 5.85 | 4.9 | 5.05 | ||
| 13.15 | 13.6 | 14 | 13.65 | 13 | 12.6 | 13.75 | 13.4 | 12.6 | ||
| 5.6 | 4.35 | 5.2 | 6.25 | 5.2 | 5.6 | 6.1 | 5.2 | 6.2 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 5.85 | 6.9 | 5.3 | 5.05 | 5.45 | 4.8 | 4.4 | 5.2 | 5.2 | ||
| 5.05 | 5.75 | 5.1 | 5.5 | 5.75 | 4.7 | 5.7 | 5.2 | 5.6 | ||
| 28.2 | 27.25 | 27.3 | 27.35 | 28.25 | 26.3 | 30.35 | 28.5 | 26.7 | ||
| 4.5 | 5.1 | 5.1 | 5.7 | 5.05 | 5.65 | 4.85 | 5.1 | 5.9 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 5.7 | 4.7 | 5.05 | 5.1 | 5.45 | 5.55 | 5 | 5.1 | 4.95 | ||
| 5.55 | 4.85 | 4.95 | 4.9 | 4.8 | 5.6 | 5.2 | 4.7 | 4.6 | ||
| 6.4 | 5.15 | 5.1 | 5.6 | 5.75 | 5.65 | 5.7 | 5.4 | 5.1 | ||
| 4.5 | 4.7 | 5.05 | 4.8 | 4.45 | 5.3 | 5.1 | 4.45 | 4.85 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 4.85 | 5.25 | 4.6 | 5 | 5 | 4.7 | 5.05 | 4.35 | 5.6 | ||
| 4.9 | 4.9 | 4.75 | 5.25 | 4.7 | 5.05 | 5.5 | 4 | 5.65 | ||
| 5.45 | 5.4 | 4.7 | 5.2 | 5.05 | 4.95 | 5.4 | 4.55 | 5.65 | ||
| 5.2 | 4.75 | 6.2 | 5.45 | 5.5 | 5.7 | 5.55 | 5.6 | 5.5 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 6.45 | 6.35 | 4.35 | 5.05 | 5.3 | 4.7 | 4.65 | 5 | 4.3 | ||
| 6.55 | 5.9 | 4.75 | 5.65 | 5.4 | 5.2 | 5 | 5.1 | 4.55 | ||
| 6.45 | 6.35 | 4.35 | 5.15 | 5.35 | 4.7 | 4.7 | 5 | 4.3 | ||
| 4.85 | 5.95 | 4.9 | 4.95 | 5.4 | 5.55 | 4.25 | 5.45 | 5 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 5.1 | 6.05 | 4.7 | 4.7 | 4.65 | 5.35 | 5.05 | 4.45 | 5.45 | ||
| 4.95 | 5.05 | 5.05 | 4.45 | 4.25 | 5.05 | 4.2 | 4.85 | 5.25 | ||
| 5.4 | 6.4 | 4.75 | 5 | 4.7 | 5.35 | 5.35 | 4.5 | 5.45 | ||
| 4.95 | 6.1 | 4.6 | 4.65 | 5.2 | 5.25 | 4.35 | 5.05 | 5.05 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 5.75 | 5.15 | 4.75 | 4.8 | 5.25 | 4.75 | 5 | 5.8 | 5.3 | ||
| 5.7 | 5.05 | 4.85 | 4.7 | 5.7 | 4.75 | 5.2 | 5.45 | 5.2 | ||
| 5.4 | 4.8 | 4.55 | 4.2 | 5.05 | 4.6 | 4.5 | 5.55 | 5.25 | ||
| 5.05 | 4.9 | 5.65 | 4.95 | 4.85 | 5 | 5.3 | 5.35 | 4.9 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 5.35 | 6.15 | 5.1 | 6.05 | 5.4 | 4.75 | 4.9 | 5.3 | 4.9 | ||
| 5.65 | 5.8 | 4.95 | 5.55 | 5.1 | 5.05 | 5.55 | 5.1 | 5.1 | ||
| 4.15 | 5.25 | 4.85 | 4.55 | 4.3 | 4.4 | 3.95 | 4.9 | 4.4 | ||
| 5 | 4.9 | 3.75 | 5.15 | 5.15 | 5.15 | 5.3 | 5.4 | 4.7 | ||
| Dynamic | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Chi-squared | Normal | Student-t | ||||||||
| (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | (50,25) | (100,50) | (200,100) | ||
| 6.3 | 5.25 | 5.1 | 5.05 | 5.5 | 5.25 | 5.4 | 4.75 | 4.85 | ||
| 5.8 | 5.15 | 4.45 | 4.7 | 5.3 | 4.9 | 5.4 | 4.7 | 4.9 | ||
| 6.75 | 5.45 | 5.1 | 5.45 | 5.6 | 5.3 | 5.85 | 5.15 | 4.95 | ||
| 4.9 | 5.3 | 5.4 | 4.35 | 5 | 5.45 | 4.35 | 5.8 | 5.15 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | (50,50) | (100,100) | (200,200) | ||
| 6.65 | 5.25 | 5.95 | 5.8 | 5.35 | 4.6 | 6.2 | 5.3 | 5 | ||
| 5.7 | 5.35 | 5.4 | 6.1 | 5.45 | 4.45 | 6.05 | 5.5 | 4.6 | ||
| 6.7 | 5.3 | 6 | 5.9 | 5.35 | 4.6 | 6.35 | 5.4 | 5 | ||
| 4.75 | 4.85 | 4.85 | 4.85 | 4.6 | 4.25 | 4.8 | 4.8 | 5.5 | ||
| Chi-squared | Normal | Student-t | ||||||||
| (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | (50,100) | (100,200) | (200,400) | ||
| 5.7 | 5.9 | 5.65 | 6.2 | 5.1 | 5.35 | 5.95 | 5.5 | 5.45 | ||
| 6.45 | 6.05 | 5.1 | 5.4 | 5.45 | 5.05 | 5.05 | 5.3 | 5.15 | ||
| 5.3 | 5.7 | 5.6 | 5.75 | 5 | 5.3 | 5.6 | 5.35 | 5.3 | ||
| 4.4 | 6 | 4.55 | 5.7 | 5.75 | 5.05 | 5 | 4.95 | 5.6 | ||
Unified and robust Lagrange multiplier type tests for cross-sectional independence in large panel data models
Zhenhong Huang (Department of Statistics and Actuarial Science, The University of Hong Kong), Zhaoyuan Li (School of Data Science, The Chinese University of Hong Kong, Shenzhen) and Jianfeng Yao (School of Data Science, The Chinese University of Hong Kong, Shenzhen)
Abstract
This paper revisits the Lagrange multiplier type test for the null hypothesis of no cross-sectional dependence in large panel data models. We propose a unified test procedure and its power enhancement version, which show robustness for a wide class of panel model contexts. Specifically, the two procedures are applicable to both heterogeneous and fixed effects panel data models with the presence of weakly exogenous as well as lagged dependent regressors, allowing for a general form of non-normal error distribution. With the tools from Random Matrix Theory, the asymptotic validity of the test procedures is established under the simultaneous limit scheme where the number of time periods ($T$) and the number of cross-sectional units ($N$) go to infinity proportionally. The derived theories are accompanied by detailed Monte Carlo experiments, which confirm the robustness of the two tests and also suggest the validity of the power enhancement technique.
Keywords: Cross-sectional dependence, Large panels, Coefficient heterogeneity, Weak exogeneity, Random Matrix Theory
1 Introduction
In panel data analysis, the problem of error cross-sectional dependence has attracted substantial attention in recent years. This cross-sectional dependence can arise for various reasons, such as omitted common or spatial effects. Ignoring error cross-sectional dependence can have dramatic effects on conventional panel estimators, e.g., the least squares, fixed effects and random effects estimators, and can invalidate inferential procedures such as commonly used panel unit root tests, which assume cross-sectional independence. Therefore, designing effective tests for cross-sectional dependence is essential in panel data analysis.
There has been much work on testing for cross-sectional dependence in the literature. Breusch and Pagan (1980) proposed a Lagrange multiplier ($LM$) test based on the squared pair-wise Pearson correlation coefficients of the residuals. Under the null hypothesis of no cross-sectional dependence, the $LM$ test is asymptotically chi-squared distributed with $T \to \infty$ and $N$ fixed. However, it is not applicable for large $N$, which limits its usefulness given that recent research has focused on large panels where both $N$ and $T$ can be large. In such a high-dimensional setting, there are two mainstream schemes considered by statisticians and econometricians, known as the sequential limit (SEQ-L) scheme and the simultaneous limit (SIM-L) scheme, defined as follows:
$$\text{SEQ-L}: \quad T \to \infty \ \text{first, followed by}\ N \to \infty,$$
and
$$\text{SIM-L}: \quad N \to \infty,\ T \to \infty \ \text{with}\ N/T \to c \in (0, \infty),$$
respectively. Under the SIM-L scheme, Frees (1995) proposed a distribution-free test allowing for large $N$, based on the squared pair-wise Spearman rank correlation coefficients, which is asymptotically distributed as chi-squared. The test, however, imposes limitations on the number of regressors and can be oversized for small $T$. Pesaran (2004) suggested a scaled version of the $LM$ test, denoted by $CD_{LM}$, and showed its asymptotic property. The author, however, pointed out that the $CD_{LM}$ test is not correctly centered at zero for small $T$, and is likely to exhibit large size distortions as $N$ increases. Pesaran (2004) also proposed an alternative approach, the $CD$ test, which also employs pair-wise Pearson correlation coefficients of the residuals, but without squaring them. This test has correct size under a broad range of panel data model designs, but it lacks power when the correlation coefficients within panel units have variable signs, leading to a cancellation effect. In particular, this happens when the errors are generated from a factor model whose loadings average to zero. Pesaran (2015) extended the $CD$ test to the scenario of weak cross-sectional dependence. Pesaran et al. (2008) put forward another approach, $LM_{adj}$, by deriving the exact expected values and variances of the squared correlation coefficients under the assumption of normally distributed errors and strictly exogenous regressors. By applying classical central limit theory, the $LM_{adj}$ test is shown to converge to the standard normal distribution under the SEQ-L scheme. Bailey et al. (2021) proposed a new $LM$ type test, $LM_{RMT}$, and proved its asymptotic normality under the SIM-L scheme using Random Matrix Theory. However, they require the assumptions of normal regressors and normal errors. In a slope homogeneity setting, Baltagi et al. (2012) analyzed the performance of the $CD_{LM}$ test in the fixed effects panel data model, then presented a bias-corrected test, $LM_{BC}$, and established its asymptotic normality under the SIM-L scheme assuming normal errors and strictly exogenous regressors. Baltagi et al. (2012) also showed that the $LM_{BC}$ test can be applied to the dynamic panel data model with fixed effects, using the within estimator proposed by Hahn and Kuersteiner (2002). However, the slope homogeneity restriction has often been rejected in empirical analyses; see the detailed survey by Baltagi et al. (2008). Meanwhile, there are limited tests designed for the dynamic panel data model. Examples include the GMM approach of Sarafidis et al. (2009), applied to panels with homogeneous coefficients under a factor model representation, and a heteroskedasticity-robust $LM$ test of Halunga et al. (2017), which requires a rate condition on $N$ and $T$.
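The cancellation effect mentioned above, where a test built on raw pair-wise correlations loses power if the factor loadings average to zero, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' experiment: the sample sizes, the loading magnitude 0.3, the alternating-sign loadings and the intercept-only model are assumptions chosen for illustration, and the function names are hypothetical. It evaluates Pesaran's $CD$ statistic (raw correlations) and the scaled statistic based on squared correlations on the same simulated errors.

```python
import numpy as np

def residual_correlations(e):
    """Pair-wise correlations (i < j) of the rows of the residual array e (N, T)."""
    S = e @ e.T
    d = np.sqrt(np.diag(S))
    R = S / np.outer(d, d)
    return R[np.triu_indices(e.shape[0], 1)]

def cd_and_scaled_lm(e):
    """CD statistic (raw correlations) and scaled LM statistic (squared correlations)."""
    N, T = e.shape
    rho = residual_correlations(e)
    pairs = N * (N - 1)
    cd = np.sqrt(2.0 * T / pairs) * rho.sum()
    cd_lm = np.sqrt(1.0 / pairs) * np.sum(T * rho**2 - 1.0)
    return cd, cd_lm

rng = np.random.default_rng(1)
N, T = 100, 200

# Assumed one-factor error model: loadings alternate in sign, so they average to zero
gamma = 0.3 * np.where(np.arange(N) % 2 == 0, 1.0, -1.0)
f = rng.standard_normal(T)
errors = np.outer(gamma, f) + rng.standard_normal((N, T))

# Intercept-only model: the OLS residuals are the demeaned series
e = errors - errors.mean(axis=1, keepdims=True)

cd, cd_lm = cd_and_scaled_lm(e)
print(f"CD = {cd:.2f}, scaled LM = {cd_lm:.2f}")
```

In this design the positive and negative pair-wise correlations largely offset each other in the sum of raw correlations, while the squared correlations accumulate, so the scaled statistic is expected to be far larger in magnitude than $CD$.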
We make two distinct contributions in this paper. First, we propose an $LM$ type test statistic that can be applied to a wide class of linear panel models. Specifically, we show that our test is robust to both static and dynamic heterogeneous panel data models with weakly exogenous regressors and non-normal errors. With tools from Random Matrix Theory, we treat the sample correlation matrix of residuals directly as a random matrix to establish the asymptotic normality of the test statistic under the SIM-L scheme, which has been suggested to be a more reliable strategy when dealing with high-dimensional statistical problems; see Yao et al. (2015). We also show that the proposed test is mathematically equivalent to both the $LM_{RMT}$ and the $LM_{BC}$ tests. This finding theoretically enriches the $LM_{RMT}$ and $LM_{BC}$ tests by relaxing the restrictive assumptions they need. It is worth mentioning that the existing literature on testing for cross-sectional dependence has mostly focused on the case of strictly exogenous regressors, including the $LM_{adj}$, $LM_{BC}$ and $LM_{RMT}$ tests. Though Pesaran (2015) showed that the $CD$ test is also applicable to autoregressive panel data models so long as the errors are symmetrically distributed, the properties of the $CD$ test for dynamic panels that include weakly exogenous regressors have not yet been investigated. The newly proposed test fills this gap for the weakly exogenous regressors case and, to the best of our knowledge, no such unified test exists so far.
Second, weak cross-sectional dependence is common in empirical applications; see, for example, Bailey et al. (2016) and Ertur and Musolesi (2017). This leads to a sparse correlation structure with few nonzero off-diagonal entries. Therefore, it is important to test for the existence of such weak cross-sectional dependence, which corresponds to sparse alternatives in the high-dimensional statistics literature. Most of the more powerful tests for sparse alternatives are based on the maximum absolute value of the sample correlations; see Cai et al. (2011) and Hall et al. (2010). However, these tests require stringent conditions that are infeasible in econometric applications, and they often suffer from size distortions due to slow rates of convergence. In view of this, we propose a novel and easy-to-implement test statistic for cross-sectional dependence based on the fourth moment of the sample correlations to boost the power of detecting sparse alternatives. Again, this test can be used in any of the aforementioned panel settings under suitable conditions on higher moments of the errors.
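To see why higher powers of the sample correlations can help against sparse alternatives, one can compare how strongly a single correlated pair moves the sum of squared correlations and the sum of fourth powers, each measured against its own null variability. The sketch below is only an illustration of that intuition under assumed settings (normal errors, a single pair with correlation 0.9, Monte Carlo standardization, hypothetical function names). It is not the power-enhanced statistic developed in the paper.

```python
import numpy as np

def power_sums(e):
    """Sums of squared and of fourth-power pair-wise correlations of the rows of e."""
    R = np.corrcoef(e)                       # rows are units, columns are time periods
    rho = R[np.triu_indices(e.shape[0], 1)]  # pairs with i < j
    return np.sum(rho**2), np.sum(rho**4)

rng = np.random.default_rng(2)
N, T, reps = 60, 120, 200

# Monte Carlo null distribution of both sums (independent standard normal errors)
null = np.array([power_sums(rng.standard_normal((N, T))) for _ in range(reps)])
mean0, sd0 = null.mean(axis=0), null.std(axis=0, ddof=1)

# Sparse alternative (assumed): only units 0 and 1 are correlated
r = 0.9
e = rng.standard_normal((N, T))
e[1] = r * e[0] + np.sqrt(1.0 - r**2) * e[1]

# Standardize each sum by its simulated null mean and standard deviation
z2, z4 = (np.array(power_sums(e)) - mean0) / sd0
print(f"z-score, squares: {z2:.2f}; z-score, fourth powers: {z4:.2f}")
```

Under the null every squared correlation is of order $1/T$, so a sum over many pairs dilutes one large entry. Raising to the fourth power shrinks the null contributions faster than it shrinks the single large correlation, which is why the second z-score is expected to be larger in this design.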
The remainder of the paper is organized as follows. Section 2 discusses the existing tests for no cross-sectional dependence. Section 3 introduces the new test statistics and establishes the limiting distributions under the SIM-L scheme. Sections 4 and 5 demonstrate that the proposed tests can be extended to both the dynamic and fixed effects panel data models. Section 6 reports the results of Monte Carlo simulations. Section 7 provides concluding remarks and further discussions.
Notations. Throughout the paper, for a matrix $A$, $\mathrm{tr}(A)$ represents its trace. We use $\lambda_i(A)$ to denote the eigenvalues of $A$. Further, for vectors $u$ and $v$, we write $\langle u, v \rangle$ for their scalar product. In addition, $\|\cdot\|$ represents the Euclidean norm for a vector and the induced operator norm for a matrix.
2 Existing tests for cross-sectional dependence based on sample correlation
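Consider the heterogeneous panel data model:
$$y_{it} = x_{it}'\beta_i + \varepsilon_{it}, \quad i = 1, \dots, N, \ t = 1, \dots, T, \tag{1}$$
where $i$ indexes the cross-sectional units and $t$ the time series observations, $y_{it}$ is the response variable and $x_{it}$ is a $k \times 1$ vector of regressors with unity in its first row, with coefficients $\beta_i$ allowed to vary across the cross-sectional units. For each $i$, the error terms $\varepsilon_{it}$ are assumed to be serially independent with zero mean and finite variance. The null hypothesis of interest in the literature is
$$H_0: \ \rho_{ij} = \mathrm{Corr}(\varepsilon_{it}, \varepsilon_{jt}) = 0 \quad \text{for all } i \neq j.$$
When $T$ is sufficiently large, a natural way to test $H_0$ is based on reliable estimates ($\hat\rho_{ij}$) of the pair-wise error correlations ($\rho_{ij}$). Specifically,
$$\hat\rho_{ij} = \hat\rho_{ji} = \frac{\sum_{t=1}^{T}\hat\varepsilon_{it}\hat\varepsilon_{jt}}{\left(\sum_{t=1}^{T}\hat\varepsilon_{it}^2\right)^{1/2}\left(\sum_{t=1}^{T}\hat\varepsilon_{jt}^2\right)^{1/2}}, \tag{2}$$
where $\hat\varepsilon_{it}$ is the Ordinary Least Squares (OLS) estimate of $\varepsilon_{it}$ in (1) defined by
$$\hat\varepsilon_{it} = y_{it} - x_{it}'\hat\beta_i, \tag{3}$$
with $\hat\beta_i$ being the OLS estimate of $\beta_i$ obtained by regressing the sample observations $y_{it}$ on $x_{it}$ for each $i$. In the seemingly unrelated regression equations (SURE) context with $N$ fixed and as $T \to \infty$, Breusch and Pagan (1980) proposed a Lagrange multiplier ($LM$) test for testing $H_0$ given by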
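$$LM = T\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\hat\rho_{ij}^2 = \frac{T}{2}\left[\mathrm{tr}(\hat{R}^2) - N\right], \tag{4}$$
where $\hat{R} = (\hat\rho_{ij})$ is the $N \times N$ sample correlation matrix of residuals. It has been shown that $LM$ converges in distribution to $\chi^2_{v}$, where $v = N(N-1)/2$, under $H_0$ and the normal errors assumption. However, it is well known that the $LM$ test is severely oversized when $N$ is relatively large compared to $T$. To amend this size distortion, Pesaran (2004) put forward a scaled version of the $LM$ test given by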
$$CD_{LM} = \sqrt{\frac{1}{N(N-1)}}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(T\hat\rho_{ij}^2 - 1\right), \tag{5}$$
which is asymptotically distributed as $N(0,1)$ under the SEQ-L scheme. There are two important cases where the $CD_{LM}$ test is not reliable. First, as Baltagi et al. (2012) noted, it exhibits substantial size distortions in the homogeneous panel data model. Second, Pesaran (2004) pointed out that, for finite $T$, the test tends to over-reject the null because $T\hat\rho_{ij}^2 - 1$ is not correctly centered at zero. This bias accumulates as $N$ becomes larger. In this case, Dufour and Khalaf (2002) suggested applying the bootstrap method to (5). Pesaran (2004) and Pesaran (2015) proposed an alternative adjustment based on the raw, non-squared, sample correlation coefficients given by
$$CD = \sqrt{\frac{2T}{N(N-1)}}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\hat\rho_{ij}. \tag{6}$$
The $CD$ test is asymptotically distributed as standard normal under both the SEQ-L and SIM-L schemes. However, it is widely reported that the $CD$ test suffers from a specific loss of power when the loadings have zero mean in the cross-sectional dimension under a factor representation. Baltagi et al. (2012) proposed another modified version of the $LM$ test for the fixed effects panel data model, $LM_{BC}$, given by
$$LM_{BC} = \sqrt{\frac{1}{N(N-1)}}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(T\hat\rho_{ij}^2 - 1\right) - \frac{N}{2(T-1)}. \tag{7}$$
The $LM_{BC}$ test is asymptotically standard normal with normal errors and strictly exogenous regressors under the SIM-L scheme. Pesaran et al. (2008) proposed an alternative finite sample adjustment to the $LM$ test by deriving the exact moments of the squared sample correlation coefficients under the assumptions of normal errors and strictly exogenous regressors. Their test statistic is given by
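$$LM_{adj} = \sqrt{\frac{2}{N(N-1)}}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\frac{(T-k)\hat\rho_{ij}^2 - \mu_{Tij}}{\upsilon_{Tij}}, \tag{8}$$
where
$$\mu_{Tij} = \frac{\mathrm{tr}(M_i M_j)}{T-k}, \qquad \upsilon_{Tij}^2 = \left[\mathrm{tr}(M_i M_j)\right]^2 a_{1T} + 2\,\mathrm{tr}\left[(M_i M_j)^2\right] a_{2T},$$
$$a_{1T} = a_{2T} - \frac{1}{(T-k)^2}, \qquad a_{2T} = 3\left[\frac{(T-k-8)(T-k+2)+24}{(T-k+2)(T-k-2)(T-k-4)}\right]^2,$$
and
$$M_i = I_T - X_i(X_i'X_i)^{-1}X_i'$$
is the projection matrix, where $X_i$ contains the $T$ samples on the $k$ regressors for the $i$-th individual regression. Under $H_0$ and the SEQ-L scheme, $LM_{adj}$ was shown to be asymptotically distributed as $N(0,1)$. However, as pointed out by Pesaran et al. (2008), the $LM_{adj}$ test is not robust in panel data models with weakly exogenous regressors. Bailey et al. (2021) proposed another modified $LM$ test for heterogeneous panel data models based on Random Matrix Theory, $LM_{RMT}$, given by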
[TABLE]
where
[TABLE]
with . Under the assumptions of normal regressors and normal errors, the authors showed that $LM_{RMT}$ is asymptotically normal under the SIM-L scheme. The application scopes of the discussed tests are summarized in Table 1. (The last two columns of the table contain the new tests proposed in this paper, $RLM$ and its power-enhanced version, which are developed later.)
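The correlation-based statistics reviewed in this section can be computed from unit-by-unit OLS residuals along the following lines. This is an illustrative sketch under stated assumptions: the function name, the dictionary keys and the simulated data are hypothetical, and the formulas follow the definitions of Breusch and Pagan (1980), Pesaran (2004), Baltagi et al. (2012) and Pesaran et al. (2008) as they are commonly stated. The bias-corrected statistic is evaluated here on heterogeneous OLS residuals purely for illustration, and the RMT-based statistic of Bailey et al. (2021) is not included.

```python
import numpy as np

def cross_section_tests(y, X):
    """Correlation-based test statistics from unit-by-unit OLS residuals.

    y: (N, T) responses; X: (N, T, k) regressors whose first column is 1.
    All names here are illustrative and not taken from any package.
    """
    N, T, k = X.shape
    # Annihilator M_i = I - X_i (X_i'X_i)^{-1} X_i' and residuals e_i = M_i y_i
    M = np.empty((N, T, T))
    e = np.empty((N, T))
    for i in range(N):
        Xi = X[i]
        M[i] = np.eye(T) - Xi @ np.linalg.solve(Xi.T @ Xi, Xi.T)
        e[i] = M[i] @ y[i]

    # Sample correlation matrix of residuals (uncentred cross-products)
    S = e @ e.T
    d = np.sqrt(np.diag(S))
    R = S / np.outer(d, d)
    iu = np.triu_indices(N, 1)           # index pairs with i < j
    rho = R[iu]
    pairs = N * (N - 1)

    lm = T * np.sum(rho**2)                                   # Breusch-Pagan LM
    cd_lm = np.sqrt(1.0 / pairs) * np.sum(T * rho**2 - 1.0)   # scaled LM
    cd = np.sqrt(2.0 * T / pairs) * np.sum(rho)               # CD (raw correlations)
    lm_bc = cd_lm - N / (2.0 * (T - 1))                       # bias-corrected LM

    # Bias-adjusted LM of Pesaran et al. (2008)
    v = T - k
    a2 = 3.0 * (((v - 8) * (v + 2) + 24) / ((v + 2) * (v - 2) * (v - 4))) ** 2
    a1 = a2 - 1.0 / v**2
    total = 0.0
    for i, j in zip(*iu):
        MM = M[i] @ M[j]
        tr1, tr2 = np.trace(MM), np.trace(MM @ MM)
        mu = tr1 / v
        upsilon = np.sqrt(tr1**2 * a1 + 2.0 * tr2 * a2)
        total += (v * R[i, j] ** 2 - mu) / upsilon
    lm_adj = np.sqrt(2.0 / pairs) * total

    return {"R": R, "LM": lm, "CD_LM": cd_lm, "CD": cd, "LM_BC": lm_bc, "LM_adj": lm_adj}

# Placeholder data generated under the null (assumed design, for illustration only)
rng = np.random.default_rng(0)
N, T, k = 20, 30, 2
X = np.concatenate([np.ones((N, T, 1)), rng.standard_normal((N, T, k - 1))], axis=2)
beta = rng.standard_normal((N, k))
y = np.einsum("ntk,nk->nt", X, beta) + rng.standard_normal((N, T))
stats = cross_section_tests(y, X)
print({name: round(float(val), 3) for name, val in stats.items() if name != "R"})
```

The returned correlation matrix makes it easy to confirm the trace identity $LM = \frac{T}{2}[\mathrm{tr}(\hat{R}^2) - N]$ numerically, and the difference between the scaled and bias-corrected statistics is exactly $N/(2(T-1))$ by construction.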
3 The RLM test and its power enhancement
3.1 The RLM test
Motivated by the existing well-known tests based on the sum of squared sample correlation coefficients in (4), (5), (7), (8) and (9), it is natural to consider the limiting behavior of $\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\hat\rho_{ij}^2$ under the SIM-L scheme. Throughout the paper, we consider the following assumptions.
Assumption 1.
$(N, T) \to \infty$ such that $N/T \to c \in (0, \infty)$.
Assumption 2.
For each $i$, the errors $\varepsilon_{it}$, $t = 1, \dots, T$, are i.i.d. with mean $0$ and variance $\sigma_i^2$.
Assumption 3.
(i) The errors have uniformly bounded sixth moment, i.e. $E|\varepsilon_{it}|^6 \le K$ for some positive constant $K$ and all $i$ and $t$.
(ii) The errors have uniformly bounded eighth moment, i.e. $E|\varepsilon_{it}|^8 \le K$ for some positive constant $K$ and all $i$ and $t$.
For a static heterogeneous panel data model, we further assume
Assumption 4.
For each $i$, the regressors, $x_{it}$, satisfy
(i) for all and .
(ii) let , there exists a nonrandom positive definite matrix such that .
(iii) .
Assumption 2 is standard and allows for heteroskedastic errors across units. Assumption 3 imposes the moment conditions on the errors required by the two proposed test procedures, respectively. Combined with Random Matrix Theory, it helps relax the commonly imposed normal error assumption. Assumption 4(i) only requires the regressors to be weakly exogenous. Assumptions 4(ii) and (iii) impose mild conditions on the design matrix. We note that Assumption 4 does not impose a dependence structure between errors and regressors, which allows for the regressors to be weakly exogenous. Under these assumptions, the OLS estimate $\hat\beta_i$ is asymptotically normal according to Lai and Wei (1982).
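Errors that are i.i.d. over time within each unit, have mean zero, differ in variance across units and need not be normal can be generated as in the sketch below. The three families match the column labels of the simulation tables (chi-squared, normal, Student-t), but the degrees of freedom, the range of the unit-specific standard deviations and the function name are assumptions made here for illustration and are not taken from the paper's simulation design.

```python
import numpy as np

def draw_errors(N, T, dist, rng):
    """Draw an (N, T) error array: i.i.d. over t within each unit, mean 0,
    and unit-specific variance sigma_i^2 (heteroskedastic across units)."""
    if dist == "normal":
        z = rng.standard_normal((N, T))
    elif dist == "chi2":
        df = 2  # assumed degrees of freedom
        z = (rng.chisquare(df, (N, T)) - df) / np.sqrt(2 * df)   # centred, unit variance
    elif dist == "t":
        df = 9  # assumed; more than 8 degrees of freedom keeps the eighth moment finite
        z = rng.standard_t(df, (N, T)) / np.sqrt(df / (df - 2))  # unit variance
    else:
        raise ValueError(f"unknown distribution: {dist}")
    sigma = rng.uniform(0.5, 1.5, size=(N, 1))   # assumed unit-specific standard deviations
    return sigma * z, sigma.ravel()

rng = np.random.default_rng(4)
summary = {}
for dist in ("normal", "chi2", "t"):
    eps, sigma = draw_errors(50, 400, dist, rng)
    # Overall sample mean, and average ratio of sample variance to target variance
    summary[dist] = (eps.mean(), np.mean(eps.var(axis=1) / sigma**2))
    print(dist, summary[dist])
```

Standardizing each family to unit variance before scaling by the unit-specific standard deviation keeps the three designs comparable, so that differences in test behavior can be attributed to the shape of the distribution rather than to its scale.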
For a dynamic heterogeneous panel data model with lagged dependent variable included in regressors, more assumptions are needed which will be discussed in Section 4.
Now we are in a position to introduce the $RLM$ test and establish its asymptotic property in the following theorem.
Theorem 1.
Under Assumptions 1, 2, 3 and 4,
[TABLE]
where and
The proof of Theorem 1 is provided in the Appendix, and the method is in two stages. In the first stage, Lemma 1 establishes the Central Limit Theorem of with tools from Random Matrix Theory, where . In the second stage, Lemma 2 shows that the asymptotic bias of disappears under the SIM-L scheme with .
3.1.1 Relationship between the and tests
By the respective definitions of the and in (9) and Theorem 1, we have
[TABLE]
and
[TABLE]
as . It follows that
[TABLE]
which indicates that is asymptotically equivalent to regardless of model specifications and assumptions. Note that the proof of the asymptotic normality of in Bailey et al. (2021) relies heavily on the assumptions of normal regressors and normal errors: they are used to ensure that the residuals have desirable properties, and then to transform the sample correlation matrix of the residuals into the sample correlation matrix of a normalized population with unit covariance matrix (see Section 3.1 in Bailey et al. (2021) for details). From (11), we conclude that the is also valid without the restrictive assumptions of normality. Besides, as we will show later, is also valid in both dynamic and fixed effects panel data models, which theoretically extends the application scope of . This finding is consistent with the simulation results in Bailey et al. (2021), which show such robustness of .
3.1.2 Relationship of the , and tests
By the respective definitions of the , and tests in (5), (7) and Theorem 1, we have the following identities
[TABLE]
[TABLE]
and
[TABLE]
Note that the factor and the remainder . It follows that the two tests, and , are always asymptotically equivalent, while the statistic always has a positive mean shift of value .
In particular, Theorem 1 is also valid for the statistic. Moreover, anticipating Theorems 3 and 5 in Sections 4 and 5, for dynamic and fixed effects panel data models, respectively, these asymptotic normality results are also valid for the statistic. In this sense, the results of this paper can also be considered a new extension of the test, originally developed for the homogeneous fixed effects panel data model in Baltagi et al. (2012), to various large panel models with coefficient heterogeneity.
3.2 The test
In the high dimensional setting, for testing the identity hypothesis , where , there are mainly two types of test statistics. The majority of existing tests are based on the squared Frobenius norm . However, this quadratic statistic lacks power if is a sparse matrix; see Fan et al. (2015). In view of this, tests based on the maximum of absolute values, , which share an asymptotic type I extreme value distribution, are generally powerful under sparse alternatives. This approach, however, has a main drawback: such tests can suffer from size distortions, which is common for statistics of the maximum type; see Liu et al. (2008). Moreover, the maximum type is less appropriate than the Frobenius norm (sum) type in some cases. For example, consider the alternative
[TABLE]
where is a perturbation matrix with diagonal entries being zero and non-zero off diagonal entries, where . Intuitively, can be designed as a dense matrix but with weak coefficients such that for any . Consequently, the extreme value type tests will fail to detect such a matrix. In such instances, the sum type tests are more suitable in the light of the fact that the eigenvalues of could vary from to , which results in larger quadratic statistic value by . In order to realize an interpolation of the two types of statistics above, namely the maximum type and the sum type, we propose a new test statistic based on . The reason is that large empirical correlations, , would be more emphasized in than in . To see this, consider increasingly large powers of the sample correlations, , where is a positive integer. Let , then
[TABLE]
where denotes the cardinality of the set . Therefore, the new statistic with can mimic some properties of the maximum type, while remaining a sum type smoothing statistic. The resulting power is expected to be higher than when very few sample correlations are significantly non-zero under sparse alternatives, and higher than maximum type statistics when there are many but relatively small correlations.
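The limiting argument above can be checked numerically. The following is a minimal sketch, not taken from the paper: the correlation values and the helper name `power_sum_ratio` are hypothetical. It divides the sum of the 2k-th powers of a set of correlations by the 2k-th power of the largest absolute correlation, so that the ratio should approach the number of entries attaining the maximum as k grows.

```python
import numpy as np

def power_sum_ratio(rho, k):
    """Return sum_j rho_j^(2k) / max_j |rho_j|^(2k) for a 1-D array of correlations."""
    rho = np.abs(np.asarray(rho, dtype=float))
    m = rho.max()
    # Dividing by the maximum first keeps every term in [0, 1] and avoids underflow.
    return float(np.sum((rho / m) ** (2 * k)))

# Hypothetical off-diagonal correlations: two entries share the maximum 0.6.
rho = np.array([0.6, -0.6, 0.3, 0.1, -0.05, 0.02])

for k in (1, 2, 5, 20):
    print(k, power_sum_ratio(rho, k))
```

In this toy example the ratio decreases toward 2, the number of entries tied at the maximum, which illustrates why a higher even power behaves more like a maximum type statistic while remaining a sum.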
3.2.1 Test based on
Based on the analysis above, we propose a new test statistic based on the fourth power of in the following theorem.
Theorem 2.
Under Assumptions 1, 2, 3 and 4,
[TABLE]
where and
Remark 1.
We choose to generate the test for technical simplicity. In fact, one can increase to any large even integer to obtain new tests that may have larger power in the sparse correlation setting. This strategy is feasible with the proof techniques provided in the Appendix. Further, by (15), it is expected that tests based on would share similar power with maximum type statistics, which have been suggested as powerful tests for sparse data; see, for example, Cai et al. (2014). This approach can provide a series of potential statistics that control the size well and may, at the same time, be more powerful under sparse alternatives.
Remark 2.
For the initial test (also the test), the remarkable screening technique in Fan et al. (2015) can provide an improved test that has the same asymptotic size with non-inferior asymptotic power against a broader range of alternatives. Compared to this approach, our power enhanced test avoids constructing such a “power enhancement component” by increasing the power of the sample correlation to four. However, studying the power properties of our technique with different choices of is not the main focus in this paper and remains an open problem.
The proof of Theorem 2 is similar to that of Theorem 1 and requires the two lemmas given in the Appendix.
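As a rough illustration of the ingredients shared by the two statistics, the sketch below runs unit-by-unit OLS, forms the pairwise sample correlations of the residuals, and returns the raw sums of their squares and fourth powers over pairs i < j. It is a sketch under stated assumptions: the function name `residual_correlation_sums` and the array layout are hypothetical, the correlations use the uncentered form that is common in this literature, and the centering and scaling constants of Theorems 1 and 2 are deliberately not reproduced, so the returned values are not the test statistics themselves.

```python
import numpy as np

def residual_correlation_sums(y, X):
    """y: (N, T) outcomes; X: (N, T, k) regressors (include a column of ones
    for an intercept). Returns (sum of rho_ij^2, sum of rho_ij^4) over i < j."""
    N, T = y.shape
    resid = np.empty((N, T))
    for i in range(N):
        # Heterogeneous coefficients: a separate OLS fit for each unit.
        beta_i, *_ = np.linalg.lstsq(X[i], y[i], rcond=None)
        resid[i] = y[i] - X[i] @ beta_i
    # Assumed (uncentered) pairwise correlations of the residuals.
    norms = np.sqrt(np.sum(resid ** 2, axis=1))
    R = (resid @ resid.T) / np.outer(norms, norms)
    rho = R[np.triu_indices(N, k=1)]
    return float(np.sum(rho ** 2)), float(np.sum(rho ** 4))

# Toy data generated under cross-sectional independence (all values hypothetical).
rng = np.random.default_rng(0)
N, T = 10, 40
X = np.concatenate([np.ones((N, T, 1)), rng.standard_normal((N, T, 1))], axis=2)
beta = rng.standard_normal((N, 2))
y = np.einsum("itk,ik->it", X, beta) + rng.standard_normal((N, T))
s2, s4 = residual_correlation_sums(y, X)
print(s2, s4)
```

Because every correlation lies in [-1, 1], the fourth-power sum never exceeds the squared sum; the fourth power shrinks small correlations much more than large ones, which is the mechanism behind the power enhancement.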
4 Dynamic panel data model
In this section, we show that the and tests are asymptotically valid in a dynamic panel data model, which is specified as follows:
[TABLE]
for , where is the lagged dependent variable. Let , , then (16) can be rewritten as . We show that the proposed and tests still have a standard normal limiting distribution under the null hypothesis in the dynamic panel data model. To establish the asymptotic normality, we need the following additional assumptions.
Assumption 5.
(i) is a stationary and ergodic process.
(ii) Let , holds uniformly in .
We establish the limiting distributions of the proposed tests in the following theorems.
Theorem 3.
Under Assumptions 1, 2, 3, 4 and 5,
[TABLE]
Theorem 4.
Under Assumptions 1, 2, 3, 4 and 5,
[TABLE]
Under Assumption 5, the proofs of Theorems 3 and 4 follow along the same lines as those for the static panel data model. See the Appendix.
5 Fixed effects panel data model
In this section, we establish the asymptotic normality of the and tests in a fixed effects panel data model. We find that as long as the coefficient estimator is -consistent, the proposed tests still have a standard normal limiting distribution under the null. To allow for weakly exogenous regressors, various consistent estimators have been proposed in the literature, including Chudik and Pesaran (2015) and Chudik et al. (2018), among others. However, these estimators require stronger assumptions than those of the static panel data model. For simplicity of illustration, we focus on residuals obtained from the within estimator. Strict exogeneity is then necessary for the consistency of the within estimator. One can relax this assumption to weak exogeneity by applying a -consistent estimator.
Consider a fixed effects panel data model:
[TABLE]
for , where denotes the time-invariant individual effect. The within estimator in (17) is specified by
[TABLE]
where and .
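For readers who want to see the within transformation in code, the following is a minimal sketch of the textbook within (fixed effects) estimator: each unit's data are demeaned over time, the demeaned data are pooled, and a single slope vector is estimated by least squares. The function name `within_estimator`, the array layout and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def within_estimator(y, X):
    """y: (N, T) outcomes; X: (N, T, k) regressors without an intercept.
    Returns the pooled slope estimate and the (N, T) within residuals."""
    # Demeaning over time removes the time-invariant individual effects.
    y_dm = y - y.mean(axis=1, keepdims=True)
    X_dm = X - X.mean(axis=1, keepdims=True)
    A = np.einsum("itk,itl->kl", X_dm, X_dm)   # sum_i X~_i' X~_i
    b = np.einsum("itk,it->k", X_dm, y_dm)     # sum_i X~_i' y~_i
    beta = np.linalg.solve(A, b)
    resid = y_dm - X_dm @ beta
    return beta, resid

# Toy data with individual effects and homogeneous slopes (values hypothetical).
rng = np.random.default_rng(1)
N, T, k = 8, 30, 2
X = rng.standard_normal((N, T, k))
alpha = rng.standard_normal((N, 1))
beta_true = np.array([1.0, -0.5])
y = alpha + X @ beta_true + 0.1 * rng.standard_normal((N, T))
beta_hat, resid = within_estimator(y, X)
print(beta_hat)
```

The within residuals returned here are the quantities from which residual correlations would then be computed; any other consistent estimator could be substituted by replacing the slope estimate.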
Assumption 6.
The regressors, , satisfy
(i) (strictly exogenous) and for all and .
(ii) For the demeaned regressors , and are stochastically bounded for all . Besides, exists and is nonsingular.
Under Assumptions 1, 2, 3 and 6, is consistent. We establish the validity of our proposed tests in the following theorems.
Theorem 5.
Under Assumptions 1, 2, 3 and 6,
[TABLE]
Theorem 6.
Under Assumptions 1, 2, 3 and 6,
[TABLE]
The proofs of Theorems 5 and 6 are given in the Appendix.
6 Monte Carlo simulations
In this section, we conduct Monte Carlo simulations to examine the empirical sizes and powers of our and tests, which are defined by (1) and (2), respectively, and compare their performance with that of the test and the test defined by (6) and (8), respectively. We consider four data generating processes (DGPs): heterogeneous panel data models with either strictly or weakly exogenous regressors, a fixed effects panel data model and a pure dynamic panel data model.
Before turning to the simulation results, we note that, following the arguments in Halunga et al. (2017), we regard estimated rejection frequencies within the range from 3.6% to 6.5% as evidence consistent with the robustness of the tests. Besides, we do not include the and the tests since they are almost identical to the test by (11) and (14).
6.1 Monte Carlo design
6.1.1 DGP1: Heterogeneous panel data model with strictly exogenous regressors
We first consider the DGP used in Pesaran et al. (2008), which is specified by
[TABLE]
where , . The regressors are generated as
[TABLE]
with where , . The first 50 observations are discarded to lessen the effects of initial values. Now we generate the disturbances under the null as , where and are generated from three different distributions: (i) normal, , (ii) chi-squared, and (iii) Student-t, . The normalizations in (ii) and (iii) are such that the errors have mean zero and variance one. To investigate the effects of the number of regressors, are considered.
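The standardization of the non-normal errors can be sketched as follows. This is an illustration under stated assumptions: the degrees of freedom `df_chi2` and `df_t` are hypothetical placeholders, since the values used in the paper are not reproduced here, and the function name `draw_errors` is likewise invented. The sketch relies only on the standard facts that a chi-squared variable with d degrees of freedom has mean d and variance 2d, and that a Student-t variable with d > 2 degrees of freedom has mean zero and variance d/(d-2).

```python
import numpy as np

def draw_errors(dist, size, rng, df_chi2=2, df_t=8):
    """Draw errors standardized by their theoretical mean and standard deviation.
    df_chi2 and df_t are hypothetical defaults, not the paper's settings."""
    if dist == "normal":
        return rng.standard_normal(size)
    if dist == "chi2":
        # Chi-squared(d): mean d, variance 2d.
        return (rng.chisquare(df_chi2, size) - df_chi2) / np.sqrt(2 * df_chi2)
    if dist == "t":
        # Student-t(d), d > 2: mean 0, variance d / (d - 2).
        return rng.standard_t(df_t, size) / np.sqrt(df_t / (df_t - 2))
    raise ValueError(f"unknown distribution: {dist}")

rng = np.random.default_rng(2)
for name in ("normal", "chi2", "t"):
    e = draw_errors(name, 200_000, rng)
    print(name, e.mean(), e.var())
```

With a large number of draws, the sample mean and variance of each standardized series should be close to zero and one, respectively, so that the three error designs differ only in shape (skewness and tail thickness).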
To examine the powers of the proposed tests, the disturbances are generated by a factor model as follows:
[TABLE]
where are the factors with and are the loadings. We consider the following three cases of loading construction:
(1) Dense case. , for , where and
(2) Sparse case. , for , and , for , where is the integer part of .
(3) Less-sparse case. , for , and , for .
In the dense case, measures the degree of cross-sectional dependence. The sparse case and the less-sparse case follow the design used in Bailey et al. (2016) to model weak and strong cross-sectional dependence, respectively. The Monte Carlo experiments are conducted for , and three different choices of ratio based on 2000 replications. To obtain the empirical size, the proposed test, test and test are implemented at the one-sided 5% nominal significance level, while test is conducted at the two-sided 5% nominal significance level.
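The two rejection rules described above, and the sampling variability of an empirical size based on 2000 replications, can be sketched as follows. This is a generic illustration for a statistic that is standard normal under the null; the function names `rejects` and `mc_standard_error` are hypothetical and nothing here is specific to the paper's statistics.

```python
import math
from statistics import NormalDist

def rejects(stat, two_sided=False, level=0.05):
    """Decision rule for a statistic that is N(0, 1) under the null."""
    z = NormalDist()
    if two_sided:
        return abs(stat) > z.inv_cdf(1 - level / 2)   # critical value about 1.96
    return stat > z.inv_cdf(1 - level)                # critical value about 1.645

def mc_standard_error(p, reps):
    """Standard error of a rejection frequency estimated from `reps` replications."""
    return math.sqrt(p * (1 - p) / reps)

print(rejects(1.7), rejects(1.7, two_sided=True))
print(mc_standard_error(0.05, 2000))
```

At a true size of 5% and 2000 replications, the Monte Carlo standard error is roughly 0.49 percentage points, which gives a sense of how much an empirical size can deviate from the nominal level through simulation noise alone.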
6.1.2 DGP2: Heterogeneous panel data model with weakly exogenous regressors
To investigate the performances of the and tests in panel data models with weakly exogenous regressors, we consider the following DGP:
[TABLE]
where , , and
[TABLE]
[TABLE]
with where , . This setup allows for feedback from to the regressors, thus rendering weakly exogenous. The errors, , are generated in the same way as in DGP1.
6.1.3 DGP3: Fixed effects panel model
The third DGP considered is a fixed effects panel data model with homogeneous coefficients, which is specified as
[TABLE]
where and are set arbitrarily to 1 and , respectively, . The regressors and errors are generated in the same way as DGP1.
6.1.4 DGP4: Dynamic panel data model
To examine the properties of the and tests in a dynamic panel data model, we follow the design of Pesaran et al. (2008):
[TABLE]
with , where , and the fixed effects, , are drawn as , with . The errors, , are generated in the same way as DGP1.
6.2 Simulation results
Table 2 reports the empirical size of these tests for DGP1. The proposed and tests successfully control the size under almost all settings, irrespective of the number of regressors included in the panel data model. (However, for small sample sizes with more regressors, i.e. and , the and tests are slightly oversized. For example, the empirical sizes of and are 7.65% and 7.4% under normal errors, respectively.) For a fixed ratio , the empirical sizes of the and tests converge to the nominal size of as , which confirms the asymptotic normality of the tests under the SIM-L scheme. Besides, the performance of is almost identical to that of . The test has correct size in all cases.
Table 3 reports the empirical power of these tests under the alternative with dense factors. The test has comparable power to regardless of combinations and error distributions. In contrast, the test has little power by construction, since the mean of the factor loadings is close to zero, as mentioned by Pesaran et al. (2008). The power enhancement version of , the test, outperforms the others across the board, especially when . For example, the power of the test is 78.25% for and Student-t errors, whereas the powers of the and tests are 60.1% and 56.75%, respectively. It improves the power by up to around 30%. Besides, the power of the test is 69.2% for and chi-squared errors, whereas the powers of the and tests are 51.2% and 51.1%, respectively. These results indicate that the test successfully boosts the power.
The empirical powers of these tests under the alternatives with sparse and less-sparse factors are summarized in Tables 4 and 5, respectively. The test again performs similarly to . The empirical power of shows that it performs the best among those tests. The power of the test hovers around 5% throughout, as in Pesaran (2015).
For the heterogeneous panel data model with weakly exogenous regressors, Table 6 shows that becomes considerably oversized, especially for the case , where it has size around 28%. This shows that the test is not robust to weakly exogenous regressors, which is also observed in Bailey et al. (2021). However, our proposed test and tests control the size well, which are not sensitive to the strictly exogenous assumptions on regressors. Therefore, though it is widely reported that the test has generally satisfying empirical performances regardless of restrictive strictly exogenous regressors and normal errors assumptions, the present DGP2 is indeed a rare situation where the test is outperformed by others.
Table 7 reports the size of the tests for the fixed effects panel data model. It shows that the proposed and tests have the correct size, close to the 5% nominal significance level, for example, has 5.1% and 5% size results, respectively for under normal errors and for under chi-squared errors. Similar results for can be also observed in this table. Pesaran’s and tests have correct size in this setting as in Pesaran (2004) and Pesaran et al. (2008).
Finally, Table 8 gives the empirical size of these tests for the dynamic panel data model. It shows that the proposed and tests have the correct size, e.g. 5.15% for with chi-squared errors, which is comparable to the test. The always has correct size as in Pesaran (2004). The results on empirical power for DGP2, DGP3 and DGP4 are similar to those for DGP1, so we omit them here.
Based on these findings, the test is strongly recommended for practitioners if it is not clear whether weakly exogenous regressors are present or not, given its universally correct size and better power performance. Instead, when regressors are believed to be strictly exogenous, the test, an easily implemented and computationally cheap procedure, is preferred for large panels (). For , is still applicable though it might be slightly oversized, or the test is a suggested method at the cost of intensive computation.
7 Conclusion
This paper has developed a Lagrange multiplier type test for the null hypothesis of no cross-sectional dependence in large panel models. The procedure can be applied to a wide class of linear panel data models and shows robustness to quite general forms of non-normality in the disturbance distribution. We further proposed a power enhancement version of the type test based on the fourth moment of the sample correlations obtained from residuals to boost power under sparse alternatives, which only requires the existence of higher moments but still shares such robustness. The simulations illustrate that this test has satisfactory power under the sparse alternatives of weak cross-sectional dependence, and both of the tests successfully control the size in different data generating processes.
For future work, it would be interesting to explore theoretically the power properties of and the optimal that would maximise power. It would also be of interest to investigate the performance of the and tests in the weak cross-sectional dependence framework. In addition, testing the null hypothesis of no cross-sectional dependence when errors are serially dependent will also be studied.
Appendix
This appendix includes the proofs of the following lemmas:
Lemma 1.
Under Assumptions 1, 2 and 3,
[TABLE]
Lemma 2.
Under Assumptions 1, 2, 3 and 4,
[TABLE]
Lemma 3.
Under Assumptions 1, 2 and 3,
[TABLE]
Lemma 4.
Under Assumptions 1, 2, 3 and 4,
[TABLE]
In the static heterogeneous panel data model, is the OLS estimator and the residuals are given by . Let , for , consequently, . Define , and . Using this notation, , and the sample covariance matrices can be written as , with elements , respectively.
To prove the results above, several lemmas are introduced as follows.
Lemma 5.
(Theorem 13 of Chapter 13, Petrov (1975)) Let be independent and identically distributed random variables, such that , and for some . Then
[TABLE]
for all , where is the cumulative distribution function of standard normal random variable and is a positive constant depending only on .
Lemma 6.
Let be an array of independent and identically distributed random variables such that , and for some . Let , then for any , we have
[TABLE]
Proof.
For some let , we have
[TABLE]
where the first inequality follows by Lemma 5, and the first approximation follows by the fact that for large . Therefore, holds. ∎
Remark 3.
For the panel data model, if the errors satisfy the conditions in Lemma 5 and , then for any , the estimate still holds once we further assume that for some positive constants and . It holds naturally since in the SIM-L scheme.
Lemma 7.
(Li et al. (2012)) Suppose , then
[TABLE]
when .
Lemma 8.
Under Assumptions 1, 2, 3 and 4, for any and some integer , i.e.
(a) .
(b) .
(c) .
(d) .
(e) .
(f) .
Proof.
(a). Firstly, we consider the case . By Assumption 2, we obtain Therefore, by Lemma 6 for some and for any ,
[TABLE]
Consequently,
[TABLE]
The calculation for the case is similar, so we omit it here.
(b). We have
[TABLE]
(c). By CLT, where and , then for some and for any by Lemma 6. It can be easily found that , therefore,
[TABLE]
(d). Note that for , it follows along the same lines as that of (c).
(e). By (a) and (b), we have
[TABLE]
(f). The conclusion holds from (c) and (e).
∎
Proof of Lemma 1
Proof.
By Theorem 3.1 of Yin et al. (2021), there exist constants , and such that:
[TABLE]
Applying the results in Example 3.2 of Yin et al. (2021) with , we obtain . For the case and , the results in Example 3.3 of Yin et al. (2021) show that and . Finally, substituting with by Slutsky’s theorem completes the proof. ∎
Proof of Lemma 2
Proof.
By direct calculation we have
[TABLE]
where constant . Therefore, we aim to show that:
(i) ,
(ii) .
(i) By Lemma 7, we have
[TABLE]
Therefore, holds.
(ii) By direct calculation we have
[TABLE]
Define then
[TABLE]
where is a constant only depending on and .
Therefore, if for any holds, then we can conclude that . Further, we only need to consider the case when by equality , i.e. if we can show for any , then immediately holds. By Lemma 8, we show that
[TABLE]
By similar calculations, we conclude that
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Therefore, we can conclude that . ∎
Proof of Lemma 3
Proof.
By Theorem 3.2 of Yin et al. (2021), there exist constants , and such that:
[TABLE]
Applying the results in Example 3.2 of Yin et al. (2021) with , we obtain . For the case and , the results in Example 3.3 of Yin et al. (2021) show that and . Finally, substituting with by Slutsky’s theorem completes the proof. ∎
Proof of Lemma 4
Proof.
It is easy to verify that
[TABLE]
where . We only aim to show that
(i) ,
(ii) ,
(iii) ,
(iv)
since we have by Lemma 2.
(i) By direct calculation we have
[TABLE]
where constant . We show that:
(i.1) [TABLE]
(i.2) [TABLE]
(i.1) By Lemma 7, we have
[TABLE]
Therefore, holds.
(i.2) By direct calculation, we have
[TABLE]
where
[TABLE]
Consequently,
[TABLE]
where is a constant only depending on and . By the same arguments in the proof of Lemma 2, we only need to show that
[TABLE]
for any . By Lemma 8, for , one can easily show that the stochastic order dominating terms are
[TABLE]
[TABLE]
[TABLE]
[TABLE]
which have the same order $O_{p}\Big(n^{\frac{10}{r_{1}}+4\epsilon_{1}+8\epsilon_{2}+\alpha_{2}-3}\Big)=o_{p}(1)$. For , stochastic order dominating terms have the same order of
[TABLE]
whose order is . Therefore, we can conclude that .
(ii) By direct calculation we have
[TABLE]
where constant . We show that:
(ii.1) [TABLE]
(ii.2) [TABLE]
(ii.1) By Lemma 7, we have
[TABLE]
Therefore, holds.
(ii.2) By direct calculation, we have
[TABLE]
where
[TABLE]
Consequently,
[TABLE]
where is a constant only depending on and . By the same arguments in the proof of Lemma 2, we only need to show that
[TABLE]
for any . By Lemma 8, for , one can show that the stochastic order dominating terms are
[TABLE]
[TABLE]
and
[TABLE]
which have the same order . For , stochastic order dominating terms have the same order of
[TABLE]
whose order is . Therefore, we can conclude that .
(iii) By direct calculation, we have
[TABLE]
For constant , we show that:
(iii.1) [TABLE]
(iii.2) [TABLE]
(iii.1) By Lemma 7, we have
[TABLE]
Therefore, holds.
(iii.2) By direct calculation, we have
[TABLE]
where
[TABLE]
Thus
[TABLE]
where is a constant only depending on and . By the same arguments in the proof of Lemma 2, we only need to show that
[TABLE]
for any . By Lemma 8, for , one can show that the stochastic order dominating terms have the same order of
[TABLE]
whose orders are . For , the stochastic order dominating terms have the same order of
[TABLE]
whose orders are . Therefore, we can conclude that
(iv) By direct calculation, we have
[TABLE]
For constant , we show that:
(iv.1) [TABLE]
(iv.2) [TABLE]
(iv.1) By Lemma 7, we have
[TABLE]
Therefore, holds.
(iv.2) By direct calculation, we have
[TABLE]
where
[TABLE]
Thus
[TABLE]
where is a constant only depending on and . By the same arguments in the proof of Lemma 2, we only need to show that
[TABLE]
for any . By Lemma 8, for , stochastic order dominating terms have the same order of
[TABLE]
whose orders are . For , one can show that the stochastic order dominating terms have the same order of
[TABLE]
whose orders are . Therefore, we can conclude that . Finally, the proof of Lemma 4 is complete. ∎
Proofs of Theorems 3 and 4
For the dynamic panel data model, let , then is the OLS estimator and the residuals are given by . In vector form, . Define , and . Using this notation, . Replacing and with and , respectively, the proofs of Theorems 3 and 4 follow the same arguments as above; that is, we only need to verify that (a) and (b) in Lemma 8 still hold for the dynamic panel data model.
Lemma 9.
Under Assumptions 1, 2, 3, 4 and 5, for any and some integer , i.e.
(a) .
(b) .
Proof.
(a). Firstly, for the case
[TABLE]
We have for some integer and by Lemma 8, then
[TABLE]
For :
[TABLE]
For : Applying martingale theory, we show that converges to a centered normal distribution. Let , we aim to verify conditions A1 and A2 imposed in Corollary 2.1.10 of Duflo (2013). Firstly, where is the corresponding filtration. Therefore, under Assumption 5(i), so that A1 holds. Note that the Lyapunov condition
[TABLE]
holds under Assumption 5(i), which indicates that A2 holds as well. The assertion follows from Corollary 2.1.10 of Duflo (2013), so that by Lemma 8. Consequently,
[TABLE]
The case is similar.
(b)
[TABLE]
Under Assumption 4(ii) and 5(ii), and , so that ∎
Proofs of Theorems 5 and 6
For the fixed effects panel data model, is the within estimator and the within residuals are given by . Let , , and . Define , , and . Let and . Again, it suffices to verify that Lemma 8 (a) and (b) still hold for the fixed effect panel data model.
Lemma 10.
Under Assumptions 1, 2, 3 and 4, for any and some integer , i.e.
(a) .
(b) .
Proof.
(a) When , by and
[TABLE]
For
[TABLE]
For
[TABLE]
uniformly in since holds uniformly by Assumption 3 and by Assumption 4 and Lemma 8, we have . Therefore . Using the same techniques, and , so that .
The calculation for the case is similar.
(b)
[TABLE]
For
[TABLE]
Lastly, , and are all by Assumption 4. Therefore,
∎
- Bailey, N., J. Dandan, and J. Yao (2021). A Lagrange-multiplier test for large heterogeneous panel data models. Available at SSRN 3804164.
- Bailey, N., G. Kapetanios, and M. H. Pesaran (2016). Exponent of cross-sectional dependence: Estimation and inference. Journal of Applied Econometrics 31 (6), 929–960.
- Baltagi, B. H., G. Bresson, and A. Pirotte (2008). To pool or not to pool? In The Econometrics of Panel Data, pp. 517–546. Springer.
- Baltagi, B. H., Q. Feng, and C. Kao (2012). A Lagrange multiplier test for cross-sectional dependence in a fixed effects panel data model. Journal of Econometrics 170 (1), 164–177.
- Breusch, T. S. and A. R. Pagan (1980). The Lagrange multiplier test and its applications to model specification in econometrics. The Review of Economic Studies 47 (1), 239–253.
- Cai, T. T., T. Jiang, et al. (2011). Limiting laws of coherence of random matrices with applications to testing covariance structure and construction of compressed sensing matrices. The Annals of Statistics 39 (3), 1496–1525.
- Cai, T. T., W. Liu, and Y. Xia (2014). Two-sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 349–372.
- Chudik, A. and M. H. Pesaran (2015). Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors. Journal of Econometrics 188 (2), 393–420.
