Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data
Jushan Bai, Serena Ng

TL;DR
This paper introduces a new factor-based imputation method for missing data in panels, enabling consistent estimation of the common component and counterfactual analysis without regularization or iteration.
Contribution
It presents a novel imputation procedure leveraging factor structures in both tall and wide data blocks, with theoretical guarantees for consistency and convergence rates.
Findings
Consistent estimation of the common component at four different convergence rates.
Effective counterfactual estimation under a factor structure for potential outcomes.
Normal distribution theory for treatment effect estimation and hypothesis testing.
Abstract
This paper proposes an imputation procedure that uses the factors estimated from a tall block along with the re-rotated loadings estimated from a wide block to impute missing values in a panel of data. Assuming that a strong factor structure holds for the full panel of data and its sub-blocks, it is shown that the common component can be consistently estimated at four different rates of convergence without requiring regularization or iteration. An asymptotic analysis of the estimation error is obtained. An application of our analysis is estimation of counterfactuals when potential outcomes have a factor structure. We study the estimation of average and individual treatment effects on the treated and establish a normal distribution theory that can be useful for hypothesis testing.
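The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pca_factors` and `tall_wide_impute`, the SVD normalization, and the least-squares estimate of the rotation matrix `H` are all choices made here for exposition. It assumes the number of factors `r` is known, that the tall block (units observed in every period) contains at least `r` units, and that the wide block (periods in which every unit is observed) contains at least `r` periods.

```python
import numpy as np

def pca_factors(block, r):
    """Principal-components estimate so that block is approximately F @ L.T.

    The sqrt(T) scaling is one common normalization (an assumption here).
    """
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    T = block.shape[0]
    F = np.sqrt(T) * U[:, :r]               # T x r factors
    L = (Vt[:r].T * s[:r]) / np.sqrt(T)     # n x r loadings
    return F, L

def tall_wide_impute(X, r):
    """Impute NaN entries of a T x N panel X using r factors.

    Factors come from the tall block, loadings from the wide block,
    and the loadings are re-rotated so both share one basis.
    """
    miss = np.isnan(X)
    cols_obs = ~miss.any(axis=0)   # units with no missing periods (tall block)
    rows_obs = ~miss.any(axis=1)   # periods with no missing units (wide block)

    F_tall, L_tall = pca_factors(X[:, cols_obs], r)
    _, L_wide = pca_factors(X[rows_obs, :], r)

    # Rotation aligning wide-block loadings with tall-block loadings,
    # estimated on the units that appear in both blocks.
    H, *_ = np.linalg.lstsq(L_wide[cols_obs], L_tall, rcond=None)

    C_hat = F_tall @ (L_wide @ H).T        # estimated common component
    X_imp = np.where(miss, C_hat, X)       # keep observed entries as they are
    return C_hat, X_imp
```

The sketch involves two singular value decompositions and one least-squares solve, with no penalty term and no iterative refitting, which is the sense in which the abstract says regularization and iteration are not required.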
Taxonomy
Methods, Counterfactuals Explanations
