The Bias and Efficiency of Incomplete-Data Estimators in Small Univariate Normal Samples

Paul T. von Hippel

arXiv:1204.3132·math.ST·March 27, 2017

TL;DR

This paper compares the bias and efficiency of observed-data maximum likelihood (ML) and two types of multiple imputation (MI) in small univariate normal samples with values missing at random, finding that observed-data ML is more efficient and has lower mean squared error than either type of MI.

Contribution

It evaluates in detail the small-sample biases of widely used missing-data methods, showing that observed-data ML outperforms both posterior draw (PD) imputation and ML imputation in efficiency and mean squared error.

Findings

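01. Observed-data ML is more efficient and has lower mean squared error than either type of MI in small samples.

02. ML imputation is more efficient than PD imputation and has less potential for bias in small samples.

03. The bias and efficiency of PD imputation can be improved by a change of prior.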

Abstract

Widely used methods for analyzing missing data can be biased in small samples. To understand these biases, we evaluate in detail the situation where a small univariate normal sample, with values missing at random, is analyzed using either observed-data maximum likelihood (ML) or multiple imputation (MI). We evaluate two types of MI: the usual Bayesian approach, which we call posterior draw (PD) imputation, and a little-used alternative, which we call ML imputation, in which values are imputed conditionally on an ML estimate. We find that observed-data ML is more efficient and has lower mean squared error than either type of MI. Between the two types of MI, ML imputation is more efficient than PD imputation, and ML imputation also has less potential for bias in small samples. The bias and efficiency of PD imputation can be improved by a change of prior.
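The two imputation schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code. The function names, the number of imputations m, and the choice of estimand (the variance) are hypothetical, and the PD step assumes the common noninformative prior p(mu, sigma^2) proportional to 1/sigma^2, which this summary does not confirm as the prior the paper uses.

```python
import numpy as np


def ml_estimates(y_obs):
    """Observed-data ML estimates of the mean and variance (divisor n, not n - 1)."""
    n = y_obs.size
    mu_hat = y_obs.mean()
    sigma2_hat = ((y_obs - mu_hat) ** 2).sum() / n
    return mu_hat, sigma2_hat


def impute_ml(y_obs, n_mis, rng):
    """ML imputation: draw missing values conditionally on the ML estimates."""
    mu_hat, sigma2_hat = ml_estimates(y_obs)
    return rng.normal(mu_hat, np.sqrt(sigma2_hat), size=n_mis)


def impute_pd(y_obs, n_mis, rng):
    """PD imputation: draw parameters from the posterior, then draw missing values.

    Assumes the prior p(mu, sigma^2) proportional to 1 / sigma^2 (an assumption
    of this sketch), under which sigma^2 | y is SS / chi-square(n - 1) and
    mu | sigma^2, y is N(ybar, sigma^2 / n).
    """
    n = y_obs.size
    ybar = y_obs.mean()
    ss = ((y_obs - ybar) ** 2).sum()
    sigma2 = ss / rng.chisquare(n - 1)
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    return rng.normal(mu, np.sqrt(sigma2), size=n_mis)


def mi_variance_estimate(y_obs, n_mis, impute, rng, m=5):
    """Average the completed-data sample variance over m imputed data sets."""
    estimates = []
    for _ in range(m):
        completed = np.concatenate([y_obs, impute(y_obs, n_mis, rng)])
        estimates.append(completed.var(ddof=1))
    return float(np.mean(estimates))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.normal(loc=0.0, scale=1.0, size=10)
    y_obs, n_mis = y[:6], 4  # treat the last four values as missing
    print("observed-data ML variance:", ml_estimates(y_obs)[1])
    print("ML imputation variance:   ", mi_variance_estimate(y_obs, n_mis, impute_ml, rng))
    print("PD imputation variance:   ", mi_variance_estimate(y_obs, n_mis, impute_pd, rng))
```

Repeating the demo over many simulated samples and comparing each estimator's average and spread against the true variance would give a rough empirical view of the bias and efficiency comparison the abstract describes. A single run shows only the mechanics.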
