Enhancing COVID-19 Screening Models With Epidemiological and Mobility Features: Machine-Learning Model Study
Hyunwoo Choo, Dohyung Lee, Soo-Yong Shin, Jiwoo Lee, Duhun Lee, Eonji Kim, Namsoo Oh, Christina Kim, Myeongchan Kim, Hyo Jung Kim

TL;DR
This study shows that adding mobility and epidemic data to machine learning models improves accuracy in predicting COVID-19 infections.
Contribution
The novel use of mobility and epidemic data alongside patient symptoms enhances ML model performance for COVID-19 screening.
Findings
Combining mobility and epidemic data with symptoms improved ML model performance for diagnosing COVID-19.
The highest model accuracy increased from 0.8712 to 0.9104 with the inclusion of mobility and epidemic data.
External contextual data significantly enhance the accuracy of ML-based screening models.
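To put the reported figures in perspective, the short calculation below derives the absolute gain, the relative gain, and the share of remaining errors removed. It is a sketch that uses only the two numbers quoted above and assumes both are plain classification accuracy on the same evaluation set; the variable names are illustrative and do not come from the study.

```python
# Values quoted in the Findings above.
baseline = 0.8712   # best model, symptoms only
augmented = 0.9104  # best model, symptoms + mobility + epidemic data

# Absolute improvement in accuracy points.
absolute_gain = augmented - baseline

# Improvement relative to the baseline accuracy.
relative_gain = absolute_gain / baseline

# Share of the baseline's errors that disappear
# (assumption: the metric is accuracy, so error = 1 - accuracy).
error_reduction = absolute_gain / (1.0 - baseline)

print(f"absolute gain:   {absolute_gain:.4f}")
print(f"relative gain:   {relative_gain * 100:.1f}%")
print(f"error reduction: {error_reduction * 100:.1f}%")
```

Under that assumption, a gain of roughly 0.04 in accuracy corresponds to removing close to a third of the baseline model's misclassifications, which is a more informative way to read the improvement than the raw difference.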
Abstract
Despite the significant post–COVID-19 pandemic surge in research using symptom data and machine learning (ML) for patient screening, data on patient trajectories and epidemiological conditions, although crucial, have remained underused. This study aimed to enhance the performance of ML models for COVID-19 screening by incorporating mobility and epidemic information in addition to patient symptom data. Data, including daily self-reported symptoms, location information, and test results, were collected from 48,798 individuals using a smartphone app. These data were then combined with Our World in Data and national government epidemic information to train 5 ML-based screening models to classify patient infection status. The models were logistic regression, extreme gradient boosting, light gradient boosting machine, tabular data network, and Google AutoML. The addition of mobility and…
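The data-combination step described in the abstract can be sketched as a join between per-user app records and an external epidemic table, keyed by date and region. The example below is a minimal illustration in pure Python, not the authors' pipeline: the field names (`fever`, `cough`, `cases_per_100k`), the region labels, and all values are hypothetical, and the fallback of 0.0 for missing epidemic data is a placeholder rather than a recommended imputation.

```python
from datetime import date

# Hypothetical daily app records: self-reported symptoms, a coarse region
# derived from location data, and the test result used as the label.
app_records = [
    {"user": "u1", "day": date(2021, 8, 1), "region": "A", "fever": 1, "cough": 0, "positive": 1},
    {"user": "u2", "day": date(2021, 8, 1), "region": "B", "fever": 0, "cough": 1, "positive": 0},
    {"user": "u3", "day": date(2021, 8, 2), "region": "A", "fever": 1, "cough": 1, "positive": 1},
]

# Hypothetical external epidemic table: new cases per 100k, by (day, region).
epidemic = {
    (date(2021, 8, 1), "A"): 5.2,
    (date(2021, 8, 1), "B"): 0.7,
    (date(2021, 8, 2), "A"): 6.1,
}

SYMPTOM_FEATURES = ["fever", "cough"]
CONTEXT_FEATURES = ["cases_per_100k"]


def add_context(record, epidemic_table, default=0.0):
    """Return a copy of the record with the regional epidemic feature attached."""
    enriched = dict(record)
    key = (record["day"], record["region"])
    # Placeholder fallback: a real pipeline would need a deliberate
    # imputation strategy for days or regions with no epidemic data.
    enriched["cases_per_100k"] = epidemic_table.get(key, default)
    return enriched


def to_feature_row(record, feature_names):
    """Turn one record into a numeric feature vector in a fixed column order."""
    return [float(record[name]) for name in feature_names]


enriched = [add_context(r, epidemic) for r in app_records]

# Two feature matrices over the same rows: symptoms only vs. symptoms + context.
X_symptoms = [to_feature_row(r, SYMPTOM_FEATURES) for r in enriched]
X_full = [to_feature_row(r, SYMPTOM_FEATURES + CONTEXT_FEATURES) for r in enriched]
y = [r["positive"] for r in enriched]

print(X_symptoms[0], X_full[0], y[0])
```

Building both matrices from the same rows keeps the comparison fair: any classifier (for example, a logistic regression) can then be trained once on `X_symptoms` and once on `X_full` with identical labels and splits, so that a difference in performance can be attributed to the added contextual features.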
Taxonomy
Topics: COVID-19 diagnosis using AI · Data-Driven Disease Surveillance · Machine Learning in Healthcare
