Multilevel Stochastic Optimization for Imputation in Massive Medical Data Records
Wenrui Li, Xiaoyu Wang, Yuetian Sun, Snezana Milanovic, Mark Kon, Julio Enrique Castrillon-Candas

TL;DR
This paper introduces a multi-level stochastic optimization method for imputing missing data in massive medical datasets, demonstrating superior accuracy, speed, and robustness over existing techniques, including deep learning approaches.
Contribution
The paper presents a novel multi-level stochastic optimization framework for data imputation, offering exact formulations and improved performance for large-scale medical datasets.
Findings
Up to 75% reduction in imputation error.
Outperforms current methods in accuracy and robustness.
Applicable to massive datasets with high numerical stability.
Abstract
It has long been a recognized problem that many datasets contain significant levels of missing numerical data. A potentially critical predicate for application of machine learning methods to datasets involves addressing this problem. However, this is a challenging task. In this paper, we apply a recently developed multi-level stochastic optimization approach to the problem of imputation in massive medical records. The approach is based on computational applied mathematics techniques and is highly accurate. In particular, for the Best Linear Unbiased Predictor (BLUP) this multi-level formulation is exact, and is significantly faster and more numerically stable. This permits practical application of Kriging methods to data imputation problems for massive datasets. We test this approach on data from the National Inpatient Sample (NIS) data records, Healthcare Cost and Utilization Project…
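For orientation, the BLUP (universal Kriging) predictor mentioned in the abstract can be sketched with a plain dense solve. This is not the paper's multi-level formulation, which is designed to make the computation faster and more numerically stable at scale; it is only a small illustration of what the predictor computes. The exponential covariance, the linear trend basis, the `length_scale` and `nugget` parameters, and the function names `exp_cov` and `blup_impute` are all assumptions made for this example and do not come from the paper.

```python
import numpy as np

def exp_cov(a, b, length_scale=1.0):
    """Exponential covariance between rows of a and rows of b (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d / length_scale)

def blup_impute(x_obs, y_obs, x_mis, length_scale=1.0, nugget=1e-8):
    """Predict missing values at x_mis from observed pairs (x_obs, y_obs).

    Model (assumed): y = F @ beta + e, with F = [1, x] a linear trend
    and Cov(e) given by exp_cov. Dense O(n^3) solve, small n only.
    """
    n = x_obs.shape[0]
    # Covariance among observed points; the nugget keeps the solve well posed.
    C = exp_cov(x_obs, x_obs, length_scale) + nugget * np.eye(n)
    # Cross-covariance between observed points and points to impute.
    c0 = exp_cov(x_obs, x_mis, length_scale)
    # Trend design matrices for observed and missing locations.
    F = np.column_stack([np.ones(n), x_obs])
    F0 = np.column_stack([np.ones(x_mis.shape[0]), x_mis])
    # Generalized least squares estimate of the trend coefficients.
    Ci_F = np.linalg.solve(C, F)
    Ci_y = np.linalg.solve(C, y_obs)
    beta = np.linalg.solve(F.T @ Ci_F, F.T @ Ci_y)
    # BLUP: trend at the new points plus the Kriged residual.
    resid = y_obs - F @ beta
    return F0 @ beta + c0.T @ np.linalg.solve(C, resid)
```

Here `x_obs` and `x_mis` are two-dimensional arrays with one row per record and one column per fully observed covariate, and `y_obs` holds the known values of the variable being imputed. The dense solves cost on the order of n³ operations, which is the bottleneck that makes a direct approach like this impractical for massive datasets.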
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Health Systems, Economic Evaluations, Quality of Life
Methods: Test
