$K$-means clustering for sparsely observed longitudinal data
Michio Yamamoto, Yoshikazu Terada

TL;DR
This paper introduces a fast, simple $k$-means-based clustering method for sparsely observed longitudinal data; it models each cluster center with a basis function expansion so that the centers can be estimated from only a few observations per subject.
Contribution
It generalizes classical $k$-means to handle sparse longitudinal data, providing a statistically consistent and computationally efficient clustering approach.
Findings
Performs competitively with existing methods in numerical experiments
Outperforms existing methods in computational efficiency
Effectively identifies clusters in real-world sparse data
Abstract
In longitudinal data analysis, observation points of repeated measurements over time often vary among subjects except in well-designed experimental studies. Additionally, measurements for each subject are typically obtained at only a few time points. From such sparsely observed data, identifying underlying cluster structures can be challenging. This paper proposes a fast and simple clustering method that generalizes the classical $k$-means method to identify cluster centers in sparsely observed data. The proposed method employs the basis function expansion to model the cluster centers, providing an effective way to estimate cluster centers from fragmented data. We establish the statistical consistency of the proposed method, as with the classical $k$-means method. Through numerical experiments, we demonstrate that the proposed method performs competitively with, or even outperforms,…
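The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the polynomial basis, the per-subject loss (mean squared residual at that subject's own observation points), the weighted least-squares update, and every name here (`basis`, `fit_center`, `sparse_kmeans`, `init_labels`) are assumptions chosen for illustration. Each cluster center is a curve defined by basis coefficients, and the algorithm alternates between refitting those coefficients and reassigning subjects, in the style of Lloyd's algorithm for $k$-means.

```python
import numpy as np

def basis(t, degree=3):
    # Polynomial basis 1, t, ..., t^degree (an assumed stand-in for the paper's basis).
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)

def fit_center(designs, values, members):
    # Weighted least squares: every row of subject i is scaled by 1/sqrt(N_i),
    # so each subject contributes its mean (not summed) squared residual.
    X = np.vstack([designs[i] / np.sqrt(len(values[i])) for i in members])
    y = np.concatenate([values[i] / np.sqrt(len(values[i])) for i in members])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def sparse_kmeans(times, values, k, degree=3, n_iter=100, init_labels=None, seed=0):
    # times[i], values[i]: observation times and measurements of subject i;
    # the number of observations may differ between subjects.
    rng = np.random.default_rng(seed)
    n = len(times)
    designs = [basis(t, degree) for t in times]
    values = [np.asarray(v, dtype=float) for v in values]
    labels = rng.integers(k, size=n) if init_labels is None else np.array(init_labels)
    coefs = np.zeros((k, degree + 1))
    for _ in range(n_iter):
        # Update step: refit each center's coefficients from its current members.
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                members = rng.integers(n, size=1)  # reseed an empty cluster from one subject
            coefs[c] = fit_center(designs, values, members)
        # Assignment step: compare each subject with every center curve,
        # evaluated only at that subject's own observation times.
        loss = np.array([[np.mean((values[i] - designs[i] @ coefs[c]) ** 2)
                          for c in range(k)] for i in range(n)])
        new_labels = loss.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, coefs
```

Calling `sparse_kmeans(times, values, k=2)` with two lists of per-subject arrays returns one label per subject and one coefficient vector per cluster. As with classical $k$-means, the result depends on the initialization, so in practice several random starts would likely be compared.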
Taxonomy
Topics: Gene expression and cancer classification · Advanced Clustering Algorithms Research · Face and Expression Recognition
