Oversampling techniques for predicting COVID-19 patient length of stay
Zachariah Farahany, Jiawei Wu, K M Sajjadul Islam, Praveen Madiraju

TL;DR
This study uses oversampling and neural networks to improve prediction of COVID-19 patient length of stay from electronic health records, addressing class imbalance for better severity assessment.
Contribution
It introduces a novel oversampling approach combined with Bayesian-optimized neural networks for predicting COVID-19 severity based on hospital stay length.
Findings
Oversampling improves prediction accuracy for long hospital stays.
Bayesian optimization enhances neural network hyperparameter tuning.
The model achieves higher F1 scores than baseline methods.
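To illustrate why F1 is a more informative metric than accuracy on imbalanced data, here is a minimal sketch. The labels "short" and "long", the toy predictions, and the helper name `f1_for_class` are assumptions made for illustration. They are not taken from the paper.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 9 short stays, 1 long stay.
y_true = ["short"] * 9 + ["long"]

# A classifier that always predicts the majority class.
always_short = ["short"] * 10
accuracy = sum(t == p for t, p in zip(y_true, always_short)) / len(y_true)
f1_long = f1_for_class(y_true, always_short, positive="long")

# A classifier that finds the long stay at the cost of one false alarm.
finds_long = ["short"] * 8 + ["long", "long"]
f1_long_better = f1_for_class(y_true, finds_long, positive="long")
```

In this toy case the majority-class predictor looks accurate yet never identifies a long stay, which the F1 score for the "long" class exposes.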
Abstract
COVID-19 is a respiratory disease that emerged in 2019 and caused a global pandemic. It is highly infectious, and its symptoms include fever or chills, cough, shortness of breath, fatigue, muscle or body aches, headache, new loss of taste or smell, sore throat, congestion or runny nose, nausea or vomiting, and diarrhea. These symptoms vary in severity; some people with many risk factors have had lengthy hospital stays or died from the disease. In this paper, we analyze patients' electronic health records (EHR) to predict the severity of their COVID-19 infection, using length of stay (LOS) as our measure of severity. This is an imbalanced classification problem, because more patients have a short LOS than a long one. To address the imbalance, we synthetically create alternate oversampled training data sets. Once we have this oversampled data, we run it through an…
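The general idea of oversampling a training set can be sketched as follows. This uses plain random oversampling (duplicating minority-class rows with replacement), which is an assumed stand-in: the abstract as shown here does not say which oversampling technique the authors use. The feature values, the "short"/"long" labels, and the function name `random_oversample` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Resample minority classes with replacement until all classes
    are as large as the largest class. Returns shuffled rows and labels."""
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    target = max(len(members) for members in by_class.values())

    resampled = []
    for label, members in by_class.items():
        # Draw the missing number of rows from this class (zero for the majority).
        extra = rng.choices(members, k=target - len(members))
        resampled.extend((row, label) for row in members + extra)

    rng.shuffle(resampled)
    new_rows = [row for row, _ in resampled]
    new_labels = [label for _, label in resampled]
    return new_rows, new_labels


# Hypothetical training rows: [age, number_of_risk_factors].
train_rows = [[34, 0], [51, 1], [47, 0], [29, 0], [62, 1], [45, 1], [71, 4], [68, 3]]
train_labels = ["short"] * 6 + ["long"] * 2

balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)
class_counts = Counter(balanced_labels)
```

Oversampling is normally applied to the training split only, so that the held-out evaluation data keeps the original class distribution.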
Taxonomy
Topics: COVID-19 diagnosis using AI · Machine Learning in Healthcare · Artificial Intelligence in Healthcare
