Missing data imputation for noisy time-series data and applications in healthcare
Lien P. Le, Xuan-Hien Nguyen Thi, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

TL;DR
This paper compares various imputation methods for noisy, missing healthcare time-series data, demonstrating that traditional methods like MICE-RF outperform deep learning approaches in accuracy and can also provide denoising benefits.
Contribution
It provides a comprehensive comparison of imputation techniques for healthcare time series, highlighting the effectiveness of MICE-RF over deep learning methods in noisy, missing data scenarios.
Findings
MICE-RF outperforms deep learning methods in imputation accuracy.
Imputation methods can also serve as denoising tools.
Effective imputation improves downstream classification performance.
Abstract
Healthcare time series data is vital for monitoring patient activity but often contains noise and missing values caused by problems such as sensor errors or data interruptions. Imputation, i.e., filling in the missing values, is a common way to deal with this issue. In this study, we compare imputation methods for noisy time series with missing values, including Multiple Imputation with Random Forest (MICE-RF) and advanced deep learning approaches (SAITS, BRITS, Transformer), in terms of MAE, F1-score, AUC, and MCC across missing data rates from 10% to 80%. Our results show that MICE-RF imputes missing data effectively compared to the deep learning methods, and the improved classification performance on imputed data indicates that imputation can also have a denoising effect. Applying an imputation algorithm to a time series with missing data can therefore fill the gaps and reduce noise at the same time.
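The evaluation idea in the abstract (hide known values at a given missing rate, impute them, and score the result with MAE) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the synthetic noisy sine series, the missing-completely-at-random masking, the chosen rates, and the helper names `mask_at_random`, `interpolate`, and `masked_mae` are all hypothetical, and simple linear interpolation stands in for the imputers the paper actually compares (MICE-RF, SAITS, BRITS, Transformer).

```python
import bisect
import math
import random


def mask_at_random(series, rate, rng):
    """Return a copy with about `rate` of the entries set to None, plus the masked indices."""
    n = len(series)
    idx = sorted(rng.sample(range(n), int(round(rate * n))))
    masked = list(series)
    for i in idx:
        masked[i] = None
    return masked, idx


def interpolate(series):
    """Stand-in imputer (assumption): linear interpolation between observed
    neighbours; gaps at either end copy the nearest observed value."""
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        pos = bisect.bisect_left(observed, i)
        left = observed[pos - 1] if pos > 0 else None
        right = observed[pos] if pos < len(observed) else None
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out


def masked_mae(truth, imputed, idx):
    """Mean absolute error computed only on the entries that were hidden."""
    return sum(abs(truth[i] - imputed[i]) for i in idx) / len(idx)


rng = random.Random(0)
# Synthetic noisy signal (assumption): a sine wave plus Gaussian noise.
truth = [math.sin(2 * math.pi * t / 50) + rng.gauss(0, 0.1) for t in range(500)]

results = {}
for rate in (0.1, 0.2, 0.4, 0.6, 0.8):
    masked, idx = mask_at_random(truth, rate, rng)
    results[rate] = masked_mae(truth, interpolate(masked), idx)

for rate, mae in results.items():
    print(f"missing rate {rate:.0%}: MAE = {mae:.3f}")
```

Swapping `interpolate` for another imputer with the same list-in, list-out shape would reuse the masking and scoring code unchanged, which is the property that makes a comparison across methods and missing rates fair.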
Taxonomy
Topics: Data-Driven Disease Surveillance · Statistical Methods and Inference · Mental Health Research Topics
Methods: Masked autoencoder
