Estimation of Semiparametric Factor Models with Missing Data
Sijie Zheng

TL;DR
This paper introduces a projected principal component analysis (PPCA) method for semiparametric factor models that handles missing data through inverse-probability weighting, yielding consistent estimation and asymptotic inference in high-dimensional panels.
Contribution
It develops a projected principal component analysis (PPCA) procedure with inverse-probability weighting for missing data, and establishes consistency and asymptotic distributions for the estimated factors and loading functions.
Findings
Unlike classical PCA, PPCA achieves consistent factor estimation even when the time dimension T is fixed.
The method accommodates general missing-at-random mechanisms through inverse-probability weighting.
Simulations and empirical analysis validate the approach.
Abstract
We study semiparametric factor models in high-dimensional panels where the factor loadings consist of a nonparametric component explained by observed covariates and an idiosyncratic component capturing unobserved heterogeneity. A key challenge in empirical applications is the presence of missing observations, which can distort both factor recovery and loading estimation. To address this issue, we develop a projected principal component analysis (PPCA) procedure that accommodates general missing-at-random mechanisms through inverse-probability weighting. We establish consistency and derive the asymptotic distributions of the estimated factors and loading functions, allowing the sieve dimension to diverge and permitting the time dimension to be either fixed or growing. Unlike classical PCA, PPCA achieves consistent factor estimation even when T is fixed, and the limiting distributions…
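The procedure described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes the observation probabilities are known, uses a polynomial sieve, and follows the standard projected-PCA recipe (project the weighted panel onto the sieve space, then take leading eigenvectors). The function names `sieve_basis` and `ppca_ipw`, the simulated design, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def sieve_basis(X, degree=3):
    """Polynomial sieve basis with an intercept (an illustrative choice of basis)."""
    N, d = X.shape
    cols = [np.ones((N, 1))]
    for j in range(d):
        for p in range(1, degree + 1):
            cols.append(X[:, [j]] ** p)
    return np.hstack(cols)

def ppca_ipw(Y, observed, prob, X, K, degree=3):
    """Sketch of projected PCA on an inverse-probability-weighted panel.

    Y        : (N, T) panel, arbitrary values (e.g. NaN) where unobserved
    observed : (N, T) boolean mask of observed entries
    prob     : observation probabilities (scalar or (N, T)); assumed known here
    X        : (N, d) observed covariates
    K        : number of factors
    """
    T = Y.shape[1]
    # Zero out missing entries and scale observed ones by 1 / prob.
    Y_w = np.where(observed, Y, 0.0) / prob
    # Project each column of Y_w onto the sieve space via least squares.
    Phi = sieve_basis(X, degree)
    coef, *_ = np.linalg.lstsq(Phi, Y_w, rcond=None)
    PY = Phi @ coef
    # Leading K eigenvectors of the T x T matrix (PY)'(PY).
    _, eigvecs = np.linalg.eigh(PY.T @ PY)
    top = eigvecs[:, ::-1][:, :K]
    F_hat = np.sqrt(T) * top      # normalised so that F_hat' F_hat / T = I_K
    G_hat = PY @ F_hat / T        # fitted loading functions evaluated at X
    return F_hat, G_hat

# Simulated example (all design choices are assumptions for illustration).
rng = np.random.default_rng(0)
N, T, K = 500, 10, 2
X = rng.uniform(-1.0, 1.0, size=(N, 2))
G = np.column_stack([np.sin(np.pi * X[:, 0]), X[:, 1] ** 2 - 1.0 / 3.0])
Gamma = 0.5 * rng.standard_normal((N, K))   # idiosyncratic loading component
F = rng.standard_normal((T, K))
Y_full = (G + Gamma) @ F.T + rng.standard_normal((N, T))

p_obs = 0.8                                  # constant observation probability
observed = rng.random((N, T)) < p_obs
Y = np.where(observed, Y_full, np.nan)

F_hat, G_hat = ppca_ipw(Y, observed, p_obs, X, K)
```

The constant observation probability is the simplest special case of missing at random. With covariate-dependent missingness, `prob` would be an (N, T) array, and in practice it would typically have to be estimated rather than assumed known. As in any factor model, the estimated factors are identified only up to rotation.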
Taxonomy
Topics: Statistical Methods and Inference · Psychometric Methodologies and Testing · Bayesian Methods and Mixture Models
