Weighted Mean Difference Statistics for Paired Data in Presence of Missing Values
Yuntong Li, Brent J. Shelton, William St Clair, Heidi L. Weiss, John L. Villano, Arnold J. Stromberg, Chi Wang, Li Chen

TL;DR
This paper introduces a new class of non-parametric test statistics for comparing paired data with missing values, optimizing weights to improve performance over existing methods, demonstrated through simulations and cancer biomarker studies.
Contribution
It proposes a novel weighted mean difference test for partially paired data that does not rely on distributional assumptions and includes an optimal weight derivation.
Findings
The proposed test with the optimal weight outperforms existing methods in simulations of practical situations.
An optimal weight is derived for the class of weighted mean difference tests.
The method is illustrated with two cancer biomarker studies.
Abstract
Missing data is a common issue in many biomedical studies. Under a paired design, some subjects may have missing values in one or both of the conditions due to loss to follow-up, insufficient biological samples, or similar causes. Such partially paired data complicate statistical comparison of the distribution of the variable of interest between the two conditions. In this paper, we propose a general class of test statistics based on the difference in weighted sample means without imposing any distributional or model assumption. An optimal weight is derived for this class of tests. Simulation studies show that our proposed test with the optimal weight performs well and outperforms existing methods in practical situations. Two cancer biomarker studies are provided for illustration.
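To make the idea of a "difference in weighted sample means" concrete, here is a minimal Python sketch of one such statistic for partially paired data. It is an illustration under stated assumptions, not the paper's exact statistic: the function name `weighted_mean_diff_test`, the split of the data into complete pairs, x-only, and y-only observations, the sample-size-proportional default weights, the plug-in variance, and the normal approximation for the p-value are all assumptions made here. In particular, the default weights are not the optimal weight derived in the paper, which the abstract does not state.

```python
import math
import numpy as np


def weighted_mean_diff_test(x_paired, y_paired, x_only, y_only, w_x=None, w_y=None):
    """Illustrative weighted mean difference test for partially paired data.

    x_paired, y_paired: measurements from subjects observed under both conditions.
    x_only, y_only: measurements from subjects observed under one condition only.
    w_x, w_y: weights on the paired-sample means (hypothetical parameters).
    Returns (difference, z statistic, two-sided normal-approximation p-value).
    """
    xp = np.asarray(x_paired, dtype=float)
    yp = np.asarray(y_paired, dtype=float)
    xu = np.asarray(x_only, dtype=float)
    yu = np.asarray(y_only, dtype=float)
    n1, n2, n3 = len(xp), len(xu), len(yu)
    if len(yp) != n1:
        raise ValueError("x_paired and y_paired must have the same length")
    if min(n1, n2, n3) < 2:
        raise ValueError("each group needs at least two observations")

    # Assumed default: weights proportional to sample size.
    # This is NOT the optimal weight from the paper.
    if w_x is None:
        w_x = n1 / (n1 + n2)
    if w_y is None:
        w_y = n1 / (n1 + n3)

    # Difference of the two weighted sample means.
    mean_x = w_x * xp.mean() + (1 - w_x) * xu.mean()
    mean_y = w_y * yp.mean() + (1 - w_y) * yu.mean()
    diff = mean_x - mean_y

    # Plug-in variance, assuming the three groups are independent and
    # only the complete pairs contribute a covariance term.
    cov_xy = np.cov(xp, yp, ddof=1)[0, 1]
    var = (
        (w_x**2 * xp.var(ddof=1) + w_y**2 * yp.var(ddof=1) - 2 * w_x * w_y * cov_xy) / n1
        + (1 - w_x) ** 2 * xu.var(ddof=1) / n2
        + (1 - w_y) ** 2 * yu.var(ddof=1) / n3
    )

    z = diff / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, standard normal reference
    return diff, z, p_value
```

Setting `w_x = w_y = 1` discards the unpaired observations and reduces the statistic to the usual paired-difference form, while intermediate weights trade off the paired and unpaired information. Choosing those weights well is the question the paper's optimal-weight derivation addresses.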
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Statistical Methods and Inference
