Inference for Large Panel Data with Many Covariates
Markus Pelger, Jiacheng Zou

TL;DR
This paper introduces Panel-PoSI, a new method for covariate selection and inference in large panel data with many covariates, controlling false discoveries and improving power over existing methods.
Contribution
It develops a novel inference procedure combining post-selection inference with multiple testing adjustments tailored for large panel datasets with many covariates.
Findings
Panel-PoSI controls family-wise error rates effectively.
The method outperforms benchmarks in out-of-sample asset pricing tests.
It identifies a small set of relevant factors explaining investment strategies.
Abstract
This paper proposes a novel testing procedure for selecting a sparse set of covariates that explains a large-dimensional panel. Our selection method provides correct false detection control while having higher power than existing approaches. We develop the inferential theory for large panels with many covariates by combining post-selection inference with a novel multiple testing adjustment. Our data-driven hypotheses are conditional on the sparse covariate selection. We control for family-wise error rates for covariate discovery for large cross-sections. As an easy-to-use and practically relevant procedure, we propose Panel-PoSI, which combines the data-driven adjustment for panel multiple testing with valid post-selection p-values of a generalized LASSO, which allows us to incorporate priors. In an empirical study, we select a small number of asset pricing factors that explain a large…
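To make the idea of family-wise error control across a panel concrete, here is a minimal sketch that uses a plain Bonferroni correction. This is a conservative textbook baseline, not the data-driven adjustment that Panel-PoSI proposes. The function name `bonferroni_panel_selection`, the array layout (units in rows, covariates in columns), the use of NaN for covariates that were not selected for a unit, and the example p-values are all assumptions made for illustration.

```python
import numpy as np

def bonferroni_panel_selection(pvals, alpha=0.05):
    """Flag covariates under a plain Bonferroni bound (illustrative baseline only).

    pvals: (N units, J covariates) array of post-selection p-values.
           NaN marks a covariate that was not selected for that unit (assumed layout).
    Returns a boolean array of length J.
    """
    pvals = np.asarray(pvals, dtype=float)
    tested = ~np.isnan(pvals)
    n_tests = int(tested.sum())  # number of hypotheses actually tested
    if n_tests == 0:
        return np.zeros(pvals.shape[1], dtype=bool)
    threshold = alpha / n_tests  # Bonferroni: split alpha across all tests
    # Untested entries are set to 1.0 so they can never trigger a rejection.
    filled = np.where(tested, pvals, 1.0)
    # A covariate is flagged if any unit rejects at the corrected level.
    return filled.min(axis=0) <= threshold

# Hypothetical p-values for 3 units and 3 covariates.
pvals = np.array([
    [1e-5, 0.40, np.nan],
    [2e-4, np.nan, 0.03],
    [np.nan, 0.20, 0.50],
])
selected = bonferroni_panel_selection(pvals, alpha=0.05)
print(selected)
```

With six tested hypotheses the per-test threshold is 0.05 / 6, so only the first covariate is flagged. A uniform split like this ignores the panel structure and tends to lose power as the cross-section grows, which is the shortcoming a tailored adjustment is meant to address.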
Taxonomy
Topics: Spatial and Panel Data Analysis · Economic and Environmental Valuation · Statistical Methods and Inference
