Prediction of the Number of COVID-19 Confirmed Cases Based on K-Means-LSTM
Shashank Reddy Vadyala, Sai Nethra Betgeri, Eric A. Sherer, Amod Amritphale

TL;DR
This paper introduces a novel K-Means-LSTM hybrid model that improves short-term COVID-19 case prediction accuracy in Louisiana by combining feature selection, clustering, and deep learning techniques, outperforming traditional SEIR models.
Contribution
The paper presents a new combined algorithm integrating K-Means, XGBoost, and LSTM for more accurate COVID-19 case forecasting, addressing overfitting issues in existing models.
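To make the clustering stage concrete, here is a minimal sketch of K-Means grouping regions by the shape of their case curves. It is only an illustration: the region names and case counts are invented, the seeding rule and the choice of k = 2 are assumptions, and the XGBoost and LSTM stages are not shown because this summary does not describe how the authors configured them.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm. Seeding with the first k points is an
    assumption made here for determinism, not the paper's choice."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return new_labels, centroids

# Hypothetical daily case counts for four made-up regions.
regions = {
    "region_a": [10, 20, 40, 80],
    "region_b": [12, 22, 38, 85],
    "region_c": [1, 2, 2, 3],
    "region_d": [0, 1, 3, 2],
}
names = list(regions)
labels, centroids = kmeans([regions[n] for n in names], k=2)
clusters = dict(zip(names, labels))
print(clusters)
```

In a hybrid setup of this kind, regions that land in the same cluster could then share a forecasting model, which is one way to read the claim that the approach captures regional similarities.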
Findings
K-Means-LSTM achieved an RMSE of 601.20, roughly one-sixth of the SEIR model's 3615.83.
The hybrid model effectively captures regional similarities for better predictions.
The approach reduces overfitting and improves short-term forecast accuracy.
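The comparison above rests on root-mean-square error (RMSE), the square root of the mean squared difference between predicted and observed case counts. The sketch below shows the calculation on invented numbers. The helper name `rmse` and the sample values are assumptions for illustration, not the paper's code or data.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Invented confirmed-case counts and forecasts for three days.
observed = [100, 120, 150]
forecast = [110, 115, 160]
error = rmse(observed, forecast)  # sqrt((100 + 25 + 100) / 3) = sqrt(75)
print(round(error, 2))
```

Because the errors are squared before averaging, a few large misses inflate RMSE more than many small ones, so a lower value indicates forecasts that stay closer to the observed counts overall.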
Abstract
COVID-19 is a pandemic disease that began to spread rapidly in the US, with the first case detected on January 19, 2020, in Washington State. March 9, 2020, and then increased rapidly with total cases of 25,739 as of April 20, 2020. The COVID-19 pandemic is so unnerving that it is difficult to understand how any person is affected by the virus. Although most people with coronavirus (81%, according to the U.S. Centers for Disease Control and Prevention (CDC)) will have few or mild symptoms, others may rely on a ventilator to breathe, or may not be able to breathe at all. SEIR models have broad applicability in predicting the outcome of the population with a variety of diseases. However, many researchers use these models without validating the necessary hypotheses. Far too many researchers "overfit" the data by using too many predictor variables and small sample sizes to create models. Models thus…
