Enhancing Bagging Ensemble Regression with Data Integration for Time Series-Based Diabetes Prediction
Vuong M. Ngo, Tran Quang Vinh, Patricia Kearney, Mark Roantree

TL;DR
This paper presents an enhanced bagging ensemble regression model that integrates multiple diabetes-related datasets to improve the accuracy of time series-based diabetes prevalence predictions across U.S. cities.
Contribution
The study introduces EBMBag+, a novel ensemble regression method that leverages data integration and advanced ensemble techniques for improved diabetes prediction accuracy.
Findings
EBMBag+ outperformed the baseline models (SVMReg, BDTree, LSBoost, NN, LSTM, and ERMBag) on the reported error metrics.
Achieved an MAE of 0.41 and an R² of 0.9.
Demonstrated the effectiveness of data integration in health forecasting.
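For readers unfamiliar with the metrics named on this page, the following is a minimal sketch of how MAE, RMSE, MAPE, and R² are conventionally computed. The function names and the sample values are illustrative assumptions, not the paper's code or data.

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error: average size of the prediction error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: penalizes large errors more than MAE does.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent; undefined if any true value is 0.
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    # Coefficient of determination: 1 minus residual variance over total variance.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical prevalence values (percent), for illustration only.
y_true = [10.0, 12.0, 8.0, 11.0]
y_pred = [10.5, 11.5, 8.5, 10.5]

print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Lower MAE, RMSE, and MAPE indicate smaller errors, while R² closer to 1 indicates that the predictions explain more of the variance in the observed values.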
Abstract
Diabetes is a chronic metabolic disease characterized by elevated blood glucose levels, leading to complications like heart disease, kidney failure, and nerve damage. Accurate state-level predictions are vital for effective healthcare planning and targeted interventions, but in many cases, data for necessary analyses are incomplete. This study begins with a data engineering process to integrate diabetes-related datasets from 2011 to 2021 to create a comprehensive feature set. We then introduce an enhanced bagging ensemble regression model (EBMBag+) for time series forecasting to predict diabetes prevalence across U.S. cities. Several baseline models, including SVMReg, BDTree, LSBoost, NN, LSTM, and ERMBag, were evaluated for comparison with our EBMBag+ algorithm. The experimental results demonstrate that EBMBag+ achieved the best performance, with an MAE of 0.41, RMSE of 0.53, MAPE of…
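The general idea behind a bagging ensemble regressor can be sketched as follows. This is generic bootstrap aggregation with a simple least-squares line as the base learner, not the EBMBag+ algorithm, whose details are not given on this page. All names (`fit_line`, `bagged_fit`, `bagged_predict`) and the synthetic series are assumptions made for illustration.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (one distinct x): fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def bagged_fit(xs, ys, n_models=25, seed=0):
    """Fit one base model per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagged_predict(models, x):
    """Aggregate by averaging the base models' predictions."""
    return statistics.fmean([a + b * x for a, b in models])

# Synthetic yearly prevalence (percent) for one hypothetical location, 2011-2021.
years = list(range(2011, 2022))
prevalence = [9.0 + 0.2 * i + (0.1 if i % 2 == 0 else -0.1) for i in range(len(years))]

models = bagged_fit(years, prevalence)
forecast_2022 = bagged_predict(models, 2022)
print(round(forecast_2022, 2))
```

Averaging over resampled fits reduces the variance of any single base model, which is the usual motivation for bagging. A realistic setup would use the integrated multi-feature dataset and stronger base learners rather than a one-feature trend line.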
Taxonomy
Topics: Artificial Intelligence in Healthcare · Time Series Analysis and Forecasting
Methods: Long Short-Term Memory · Masked autoencoder
