Generalized k-Means in GLMs with Applications to the Outbreak of COVID-19 in the United States
Tonglin Zhang, Ge Lin

TL;DR
This paper introduces a generalized k-means clustering method for statistical models, utilizing likelihood-based dissimilarity measures, and applies it to analyze COVID-19 outbreak patterns across US states.
Contribution
It develops a novel clustering approach for statistical models using likelihood ratio and F-statistics, with automatic selection of the number of clusters via GIC, and demonstrates its effectiveness on COVID-19 data.
Findings
BIC can identify the correct number of clusters; AIC cannot.
The method successfully groups COVID-19 outbreak patterns by state.
The fitted statistical models differ significantly between clusters.
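For reference, the generalized information criterion (GIC) behind the BIC/AIC finding is commonly written in the form below. The notation is an assumption made here for illustration and may differ from the paper's: \ell is the maximized log-likelihood with k clusters, d_k the number of free parameters, n the sample size, and \kappa_n the penalty weight.

```latex
\mathrm{GIC}_{\kappa_n}(k) = -2\,\ell(\hat{\theta}_k) + \kappa_n\, d_k,
\qquad \kappa_n = 2 \ \text{(AIC)},
\qquad \kappa_n = \log n \ \text{(BIC)}.
```

The AIC penalty stays fixed as n grows while the BIC penalty increases with n, which is consistent with the finding that BIC, but not AIC, identifies the number of clusters.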
Abstract
Generalized k-means can be incorporated with any similarity or dissimilarity measure for clustering. By choosing the dissimilarity measure as the well known likelihood ratio or F-statistic, this work proposes a method based on generalized k-means to group statistical models. Given the number of clusters k, the method is established under hypothesis tests between statistical models. If k is unknown, then the method can be combined with GIC to automatically select the best k for clustering. The article investigates both AIC and BIC as the special cases. Theoretical and simulation results show that the number of clusters can be identified by BIC but not AIC. The resulting method for GLMs is used to group the state-level time series patterns for the outbreak of COVID-19 in the United States. A further study shows that the statistical models between the clusters are significantly…
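The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a Poisson log-link GLM shared by all units within a cluster, a likelihood-ratio dissimilarity between a unit and a cluster model, a farthest-first initialization, and a penalty that counts only the k·p regression coefficients. All function names (fit_poisson, lr_dissimilarity, generalized_kmeans, gic) and the toy data are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Fit a Poisson log-link GLM by Newton/IRLS steps (assumed model family)."""
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]  # rough start
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def loglik(X, y, beta):
    """Poisson log-likelihood, omitting the log(y!) term that is constant in beta."""
    eta = X @ beta
    return float(np.sum(y * eta - np.exp(eta)))

def lr_dissimilarity(X, y, beta_own, beta_center):
    """Likelihood-ratio statistic of a unit's own fit against a cluster model."""
    return 2.0 * (loglik(X, y, beta_own) - loglik(X, y, beta_center))

def generalized_kmeans(X, Y, k, n_iter=20):
    """Cluster the rows of Y (one series per unit) by their fitted GLMs."""
    n_units = Y.shape[0]
    own = [fit_poisson(X, y) for y in Y]
    # Farthest-first initialization (a choice made for this sketch only).
    centers = [own[0]]
    while len(centers) < k:
        d = [min(lr_dissimilarity(X, Y[i], own[i], c) for c in centers)
             for i in range(n_units)]
        centers.append(own[int(np.argmax(d))])
    labels = np.full(n_units, -1)
    for _ in range(n_iter):
        # Assignment: the smallest LR dissimilarity equals the largest log-likelihood.
        new_labels = np.array(
            [int(np.argmax([loglik(X, y, c) for c in centers])) for y in Y])
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update: refit one pooled GLM per non-empty cluster.
        for j in range(k):
            members = Y[labels == j]
            if len(members):
                centers[j] = fit_poisson(np.tile(X, (len(members), 1)),
                                         members.ravel())
    total_ll = sum(loglik(X, Y[i], centers[labels[i]]) for i in range(n_units))
    return labels, centers, total_ll

def gic(total_ll, k, p, kappa):
    """GIC with k*p parameters; kappa=2 gives AIC, kappa=log(n) gives BIC."""
    return -2.0 * total_ll + kappa * k * p

if __name__ == "__main__":
    # Synthetic toy data (not COVID-19 data): two groups with different slopes.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 30)
    X = np.column_stack([np.ones_like(t), t])
    betas = [np.array([2.0, 1.0])] * 3 + [np.array([2.0, 3.0])] * 3
    Y = np.array([rng.poisson(np.exp(X @ b)) for b in betas])
    for k in (1, 2, 3):
        labels, _, ll = generalized_kmeans(X, Y, k)
        print(k, labels, gic(ll, k, X.shape[1], np.log(Y.size)))
```

Because the omitted log(y!) term does not depend on the clustering, the GIC values are comparable across k; the k with the smallest value would be selected. Whether cluster memberships should also count toward the penalty is left open in this sketch.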
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Advanced Clustering Algorithms Research
