The First-stage F Test with Many Weak Instruments
Zhenhong Huang, Chen Wang, Jianfeng Yao

TL;DR
This paper investigates the limitations of the first-stage F test when used with many weak instruments, revealing size distortions and proposing a corrected alternative for more reliable detection.
Contribution
It demonstrates the failure of the classical F test in high-instrument scenarios and introduces a corrected test to improve weak instrument detection.
Findings
The classical F test exhibits size distortions with many weak instruments.
The distortions arise because classical noncentral chi-squared distributions approximate the statistic poorly when instruments are many.
A corrected F test improves weak-instrument detection in an empirical application.
Abstract
A widely adopted approach for detecting weak instruments is to use the first-stage F statistic. While this method was developed with a fixed number of instruments, its performance with many instruments remains insufficiently explored. We show that the first-stage F test exhibits distorted sizes for detecting many weak instruments, regardless of the choice of pretested estimators or Wald tests. These distortions occur due to the inadequate approximation using classical noncentral chi-squared distributions. As a byproduct of our main result, we present an alternative approach to pre-test many weak instruments with the corrected first-stage F statistic. An empirical illustration with Angrist and Krueger's (1991) returns to education data confirms its usefulness.
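For readers unfamiliar with the statistic under discussion, the following is a minimal sketch of the classical first-stage F statistic, computed on simulated data. It is not the paper's corrected statistic, which is not reproduced here. The function name `first_stage_f`, the sample size, the number of instruments, and the coefficient values are all illustrative assumptions.

```python
import numpy as np


def first_stage_f(x, Z):
    """Classical F statistic for H0: all instrument coefficients are zero.

    x : (n,) endogenous regressor
    Z : (n, k) instrument matrix (an intercept is added internally)
    """
    n, k = Z.shape
    W = np.column_stack([np.ones(n), Z])           # unrestricted design: intercept + instruments
    coef, *_ = np.linalg.lstsq(W, x, rcond=None)   # OLS first-stage fit
    resid = x - W @ coef
    rss_unrestricted = resid @ resid
    rss_restricted = np.sum((x - x.mean()) ** 2)   # intercept-only model
    numerator = (rss_restricted - rss_unrestricted) / k
    denominator = rss_unrestricted / (n - k - 1)
    return numerator / denominator


# Simulated setting (assumed values): many instruments, each only weakly relevant.
rng = np.random.default_rng(0)
n, k = 500, 30
Z = rng.standard_normal((n, k))
pi = np.full(k, 0.05)                              # small first-stage coefficients
x = Z @ pi + rng.standard_normal(n)

print(f"first-stage F with k={k} instruments: {first_stage_f(x, Z):.2f}")
```

In this sketch the number of instruments `k` is not small relative to `n`, which is the regime where the abstract reports that the classical chi-squared-based approximation becomes unreliable.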
Taxonomy
Topics: Monetary Policy and Economic Impact
Methods: Test
