Missing data and bias in physics education research: A case for using multiple imputation
Jayson Nissen, Robin Donatello, Ben Van Dusen

TL;DR
This paper shows that complete-case analysis can bias results in physics education research when data are missing, and it advocates multiple imputation as a more accurate alternative.
Contribution
Using simulated data, the study demonstrates that multiple imputation is more accurate than complete-case analysis for PER studies with missing data, and it argues for wider adoption of multiple imputation.
Findings
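Complete-case analysis can introduce bias when data are not missing completely at random (MCAR).
Multiple imputation (MI) provides more accurate results than complete-case analysis.
Using MI can reveal differences between student groups that complete-case analysis may obscure.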
Abstract
Researchers in physics education research (PER) commonly use complete-case analysis to address missing data. In complete-case analysis, researchers discard all data from any student who is missing any data. Despite its frequent use, no PER article we reviewed that used complete-case analysis provided evidence that the data met the assumption of missing completely at random (MCAR), which is necessary to ensure accurate results. Not meeting this assumption raises the possibility that prior studies have reported biased results with inflated gains that may obscure differences across courses. To test this possibility, we compared the accuracy of complete-case analysis and multiple imputation (MI) using simulated data. We simulated the data based on prior studies such that students who earned higher grades participated at higher rates, which made the data missing at random (MAR). PER studies seldom use MI, but MI…
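The comparison the abstract describes can be illustrated with a small simulation. The sketch below is not the authors' code or their simulation design: the grade scale, score model, participation rates, and the function names (`simulate`, `complete_case_mean`, `mi_mean`) are all hypothetical choices made for illustration. It generates post-test scores that depend on course grade, lets higher-grade students participate at higher rates (so scores are MAR given grade), and compares the complete-case mean with a simplified multiple-imputation estimate that uses bootstrapped regression imputation and pools only the point estimates.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def simulate(n=2000, seed=0):
    """Hypothetical cohort: grades 0-4, scores that rise with grade."""
    rng = random.Random(seed)
    grades = [rng.choice([0, 1, 2, 3, 4]) for _ in range(n)]
    scores = [40 + 8 * g + rng.gauss(0, 10) for g in grades]
    # Assumed participation rates: 30% at grade 0 up to 90% at grade 4.
    observed = [rng.random() < 0.30 + 0.15 * g for g in grades]
    return grades, scores, observed


def complete_case_mean(scores, observed):
    """Mean score using only students who participated."""
    return statistics.fmean([s for s, o in zip(scores, observed) if o])


def mi_mean(grades, scores, observed, m=20, seed=1):
    """Simplified MI: bootstrap the observed cases, refit, impute with noise."""
    rng = random.Random(seed)
    obs = [(g, s) for g, s, o in zip(grades, scores, observed) if o]
    estimates = []
    for _ in range(m):
        boot = [rng.choice(obs) for _ in obs]
        a, b, sd = fit_line([g for g, _ in boot], [s for _, s in boot])
        completed = [
            s if o else a + b * g + rng.gauss(0, sd)
            for g, s, o in zip(grades, scores, observed)
        ]
        estimates.append(statistics.fmean(completed))
    # Pool the m point estimates by averaging them.
    return statistics.fmean(estimates)


if __name__ == "__main__":
    grades, scores, observed = simulate()
    print("full-data mean:    ", round(statistics.fmean(scores), 2))
    print("complete-case mean:", round(complete_case_mean(scores, observed), 2))
    print("MI pooled mean:    ", round(mi_mean(grades, scores, observed), 2))
```

Under these assumed settings the complete-case mean overstates the full-data mean, because low-grade students are underrepresented among participants, while the imputation model recovers it more closely by conditioning on grade. A real analysis would also pool the variances across imputations and would typically rely on an established MI package rather than this hand-rolled version.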
