Hypothesis testing for matched pairs with missing data by maximum mean discrepancy: An application to continuous glucose monitoring
Marcos Matabuena, Paulo Félix, Marc Ditzhaus, Juan Vidal, and Francisco Gude

TL;DR
This paper introduces new estimators based on maximum mean discrepancy for hypothesis testing in complex matched pairs with missing data, applicable to biological data like continuous glucose monitoring.
Contribution
The paper develops novel MMD-based estimators for complex matched pairs with missing data, extending methods for the univariate case to graphs, strings, and probability distributions, with proven validity and consistency.
Findings
New estimators effectively detect distribution differences with missing data.
Simulation studies confirm statistical consistency of the proposed methods.
Application to glucose monitoring data reveals distributional changes over time.
Abstract
A frequent problem in statistical science is how to properly handle missing data in matched paired observations. There is a large body of literature coping with the univariate case. Yet, the ongoing technological progress in measuring biological systems raises the need for addressing more complex data, e.g., graphs, strings and probability distributions, among others. In order to fill this gap, this paper proposes new estimators of the maximum mean discrepancy (MMD) to handle complex matched pairs with missing data. These estimators can detect differences in data distributions under different missingness mechanisms. The validity of this approach is proven and further studied in an extensive simulation study, and results of statistical consistency are provided. Data from continuous glucose monitoring in a longitudinal population-based diabetes study are used to illustrate the application…
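To make the quantity in the abstract concrete, the sketch below computes the standard biased (V-statistic) estimate of the squared MMD with a Gaussian kernel. It then applies that estimate to matched pairs by simply using every observed value in each arm. This is not the estimator proposed in the paper, whose exact form is not given here. The function names, the Gaussian kernel, the `bandwidth` parameter, and the use of NaN to mark missing values are all assumptions made for illustration. Using all available observations this way is generally only defensible when values are missing completely at random.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and b (hypothetical helper)."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy


def mmd2_available_case(x, y, bandwidth=1.0):
    """Squared MMD from matched 1-D pairs (x[i], y[i]), with NaN marking a missing value.

    Assumption for illustration only: each arm uses all of its observed values,
    so pairs with one missing component still contribute their observed side.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return mmd2_biased(x[~np.isnan(x)], y[~np.isnan(y)], bandwidth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    before = rng.normal(0.0, 1.0, size=100)
    after = before + rng.normal(0.5, 1.0, size=100)  # shifted second measurement
    after[rng.random(100) < 0.2] = np.nan            # roughly 20% missing at random
    print(mmd2_available_case(before, after))
```

A value near zero suggests the two marginal distributions are similar. Turning the statistic into a test would still require a null distribution, for example from a resampling scheme that respects the pairing, which this sketch does not attempt.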
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
