Assessing imputation techniques for missing data in small and multicollinear datasets: insights from craniofacial morphometry
Norli Anida Abdullah, Firdaus Hariri, Mohamad Norikmal Fazli Hisam, Siti Fatimah Binti Hassan

TL;DR
This study compares five imputation methods for missing data in small, high-dimensional, highly correlated craniofacial datasets and finds that random forest imputation is the most accurate.
Contribution
Identifies random forest as the most effective imputation method for small, high-dimensional, and correlated craniofacial datasets.
Findings
Random Forest (RF) imputation had the lowest RMSE and MAE, showing high accuracy in filling missing data.
RF preserved dataset variability better than other methods, with a variance preservation score of 0.8961.
MICE had the closest variance preservation to original data but lower accuracy compared to RF.
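The metrics named in these findings can be sketched in a few lines. This is a minimal illustration, not the authors' code: RMSE and MAE are computed only over entries that were masked and then imputed, and `variance_ratio` is an assumed stand-in for the variance preservation score, whose exact definition is not given in this summary. All function names and toy numbers are hypothetical.

```python
import math
from statistics import variance


def rmse(true_vals, imputed_vals):
    """Root mean squared error between held-out true values and imputed values."""
    n = len(true_vals)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_vals, imputed_vals)) / n)


def mae(true_vals, imputed_vals):
    """Mean absolute error between held-out true values and imputed values."""
    n = len(true_vals)
    return sum(abs(t - p) for t, p in zip(true_vals, imputed_vals)) / n


def variance_ratio(original_col, imputed_col):
    """ASSUMED definition: sample variance after imputation divided by the
    sample variance of the complete original column (1.0 = fully preserved)."""
    return variance(imputed_col) / variance(original_col)


# Toy numbers (hypothetical): three masked entries and their imputed values.
true_vals = [10.0, 12.0, 14.0]
imputed_vals = [11.0, 12.0, 12.0]
toy_rmse = rmse(true_vals, imputed_vals)   # sqrt((1 + 0 + 4) / 3)
toy_mae = mae(true_vals, imputed_vals)     # (1 + 0 + 2) / 3

# A column whose two middle values were replaced by the column mean (13.0).
original_col = [10.0, 12.0, 14.0, 16.0]
mean_imputed_col = [10.0, 13.0, 13.0, 16.0]
toy_ratio = variance_ratio(original_col, mean_imputed_col)
```

In the toy column the ratio falls below 1 because mean imputation pulls values toward the center, which is the kind of variance shrinkage a preservation score is meant to detect.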
Abstract
Analyses of craniofacial morphology are essential for various medical and research applications, including the study of midfacial development and dysmorphologies and the planning of surgical interventions. CT scans are often incomplete because of patient movement, imaging artifacts, or obscured landmarks, which results in missing data. If not properly addressed, such missingness may bias conclusions and weaken statistical power. This paper evaluates imputation techniques to identify the most suitable method for handling values missing completely at random in small, high-dimensional, and highly correlated craniofacial morphometric datasets. Forty-two craniofacial variables were measured from 32 observations. The missing data structure was set to be at random, with 268 (20%) missing values. Five common imputation techniques, namely Mean/Median imputation, k-Nearest Neighbors (kNN), Multiple Imputation by…
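The evaluation design described in the abstract (mask known values at random, impute, compare against the held-out truth) can be sketched as below. This is an illustration under stated assumptions, not the study's pipeline: the data are synthetic Gaussian values rather than craniofacial measurements, the seed is arbitrary, and only column-mean imputation, the simplest of the listed techniques, is shown. Only the matrix size (32 by 42) and the 20% missing rate come from the abstract.

```python
import math
import random

N_OBS, N_VARS = 32, 42      # dimensions stated in the abstract
MISSING_RATE = 0.20         # proportion stated in the abstract

rng = random.Random(0)      # arbitrary fixed seed (assumption) for repeatability

# Synthetic stand-in data; NOT the craniofacial measurements from the study.
data = [[rng.gauss(50.0, 10.0) for _ in range(N_VARS)] for _ in range(N_OBS)]

# Missing completely at random: cells are chosen uniformly, independent of values.
cells = [(i, j) for i in range(N_OBS) for j in range(N_VARS)]
n_missing = int(MISSING_RATE * len(cells))   # 20% of 1344 cells, rounded down
masked = set(rng.sample(cells, n_missing))

# Observed matrix with None marking the masked cells.
observed = [
    [None if (i, j) in masked else data[i][j] for j in range(N_VARS)]
    for i in range(N_OBS)
]

# Column-mean imputation: fill each gap with the mean of that column's observed
# values. This assumes every column keeps at least one observed value.
col_means = []
for j in range(N_VARS):
    seen = [observed[i][j] for i in range(N_OBS) if observed[i][j] is not None]
    col_means.append(sum(seen) / len(seen))

imputed = [
    [col_means[j] if observed[i][j] is None else observed[i][j] for j in range(N_VARS)]
    for i in range(N_OBS)
]

# Score the imputation only on the cells that were masked.
sq_err = [(data[i][j] - imputed[i][j]) ** 2 for (i, j) in masked]
rmse_masked = math.sqrt(sum(sq_err) / len(sq_err))
print(f"masked cells: {n_missing}, RMSE on masked cells: {rmse_masked:.3f}")
```

Swapping another imputer into the step that builds `imputed` while keeping the same mask lets the methods be compared on identical held-out cells.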
Taxonomy
Topics: Orthodontics and Dentofacial Orthopedics · Forensic Anthropology and Bioarchaeology Studies · Face recognition and analysis
