An information criterion for model selection with missing data via complete-data divergence
Hidetoshi Shimodaira, Haruyoshi Maeda

TL;DR
This paper introduces a new information criterion for model selection with missing data, which accounts for incomplete observations and provides an asymptotically unbiased estimate of the complete-data divergence.
Contribution
The paper derives a novel information criterion whose additional penalty term for missing data is smaller than that of existing criteria such as PDIO and AICcd, and which holds under weaker assumptions.
Findings
The new criterion is asymptotically unbiased for complete-data divergence.
Its additional penalty term for missing data is smaller than that of the previous criteria PDIO and AICcd.
Simulation results confirm the unbiasedness of the new criterion.
Abstract
We derive an information criterion to select a parametric model of complete-data distribution when only incomplete or partially observed data is available. Compared with AIC, our new criterion has an additional penalty term for missing data, which is expressed by the Fisher information matrices of complete data and incomplete data. We prove that our criterion is an asymptotically unbiased estimator of complete-data divergence, namely, the expected Kullback-Leibler divergence between the true distribution and the estimated distribution for complete data, whereas AIC is that for the incomplete data. Information criteria PDIO (Shimodaira 1994) and AICcd (Cavanaugh and Shumway 1998) have been previously proposed to estimate complete-data divergence, and they have the same penalty term. The additional penalty term of our criterion for missing data turns out to be only half the value of that…
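To make the role of the two Fisher information matrices concrete, here is a minimal sketch of how such penalties could be computed. It is not taken from the text above. It assumes the penalty depends on the matrices only through tr(I_x I_y^{-1}), that the new criterion has the form -2 l_y + d + tr(I_x I_y^{-1}), and that the PDIO/AICcd-style penalty has the form 2 tr(I_x I_y^{-1}). These forms are one reading consistent with the abstract's "half the value" statement, on the further assumption that the truncated sentence refers to PDIO and AICcd. The function and key names are hypothetical.

```python
import numpy as np


def trace_ratio(fisher_x, fisher_y):
    """Return tr(I_x I_y^{-1}) without forming the inverse explicitly."""
    fisher_x = np.asarray(fisher_x, dtype=float)
    fisher_y = np.asarray(fisher_y, dtype=float)
    # solve(I_y, I_x) gives I_y^{-1} I_x; its trace equals tr(I_x I_y^{-1})
    # because the trace is invariant under cyclic permutation.
    return float(np.trace(np.linalg.solve(fisher_y, fisher_x)))


def criteria(loglik_y, fisher_x, fisher_y):
    """Compare AIC with two assumed complete-data criteria.

    loglik_y : maximized incomplete-data log-likelihood
    fisher_x : complete-data Fisher information (d x d), at the estimate
    fisher_y : incomplete-data Fisher information (d x d), at the estimate
    """
    d = np.asarray(fisher_y).shape[0]  # number of parameters
    t = trace_ratio(fisher_x, fisher_y)
    base = -2.0 * loglik_y
    return {
        "aic": base + 2 * d,            # standard AIC on the observed data
        "half_penalty": base + d + t,   # ASSUMED form of the new criterion
        "full_penalty": base + 2 * t,   # ASSUMED form of the PDIO/AICcd penalty
    }


if __name__ == "__main__":
    # Toy matrices chosen for illustration only, not from the paper.
    I_y = np.diag([1.0, 2.0])
    I_x = np.diag([2.0, 3.0])
    print(criteria(-10.0, I_x, I_y))
```

Under these assumptions, all three criteria coincide when I_x equals I_y (no information is lost to missingness), and the extra penalty over AIC in `half_penalty` is exactly half of that in `full_penalty`.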
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · Advanced Statistical Methods and Models
