$K$-means clustering for sparsely observed longitudinal data

Michio Yamamoto; Yoshikazu Terada

arXiv:2411.08256·stat.ME·November 14, 2024

Open Access

TL;DR

This paper introduces a fast, simple $k$-means-based clustering method for sparsely observed longitudinal data. It models the cluster centers with a basis function expansion, so the centers can be estimated even when each subject is measured at only a few time points.

Contribution

The paper generalizes classical $k$-means to sparsely observed longitudinal data, yielding a clustering method that is statistically consistent and computationally efficient.

Findings

01 Performs competitively with existing methods

02 Outperforms existing methods in computational efficiency

03 Effectively identifies clusters in real-world sparse data

Abstract

In longitudinal data analysis, observation points of repeated measurements over time often vary among subjects except in well-designed experimental studies. Additionally, measurements for each subject are typically obtained at only a few time points. From such sparsely observed data, identifying underlying cluster structures can be challenging. This paper proposes a fast and simple clustering method that generalizes the classical $k$-means method to identify cluster centers in sparsely observed data. The proposed method employs the basis function expansion to model the cluster centers, providing an effective way to estimate cluster centers from fragmented data. We establish the statistical consistency of the proposed method, as with the classical $k$-means method. Through numerical experiments, we demonstrate that the proposed method performs competitively with, or even outperforms,…
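The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a polynomial basis, a per-subject loss equal to the mean squared residual at that subject's own observation times, and a Lloyd-style alternation between assignment and least-squares updates. The names `basis`, `subject_loss`, and `sparse_kmeans`, and all their parameters, are hypothetical.

```python
import numpy as np


def basis(t, n_basis=3):
    """Polynomial basis 1, t, t^2, ... at times t (an illustrative choice)."""
    return np.vander(np.asarray(t, dtype=float), n_basis, increasing=True)


def subject_loss(y, design, coef):
    """Mean squared residual of one subject's observations against one center."""
    return np.mean((y - design @ coef) ** 2)


def sparse_kmeans(times, values, n_clusters, n_basis=3, n_iter=100,
                  init_labels=None, seed=0):
    """Lloyd-style alternation for sparsely observed curves (hypothetical sketch).

    times, values: lists of 1-D arrays, one pair per subject; lengths may
    differ between subjects, but each subject needs at least one observation.
    Returns (labels, coefs, history); history holds the objective per iteration.
    """
    rng = np.random.default_rng(seed)
    n = len(times)
    values = [np.asarray(y, dtype=float) for y in values]
    designs = [basis(t, n_basis) for t in times]
    if init_labels is None:
        labels = rng.integers(n_clusters, size=n)
    else:
        labels = np.asarray(init_labels).copy()
    coefs = np.zeros((n_clusters, n_basis))
    history = []
    for _ in range(n_iter):
        # Update step: one weighted least-squares fit per cluster, pooling
        # the observations of all subjects currently assigned to it.
        for k in range(n_clusters):
            members = np.flatnonzero(labels == k)
            if members.size == 0:
                continue  # empty cluster: keep its previous coefficients
            # Scaling rows by 1/sqrt(n_i) weights subject i's squared error by 1/n_i.
            X = np.vstack([designs[i] / np.sqrt(len(values[i])) for i in members])
            y = np.concatenate([values[i] / np.sqrt(len(values[i])) for i in members])
            coefs[k] = np.linalg.lstsq(X, y, rcond=None)[0]
        # Assignment step: each subject moves to the center with the smallest loss.
        losses = np.array([[subject_loss(values[i], designs[i], coefs[k])
                            for k in range(n_clusters)] for i in range(n)])
        new_labels = losses.argmin(axis=1)
        history.append(losses[np.arange(n), new_labels].mean())
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, coefs, history
```

Because every center is a function of time, a subject can be compared with it at whatever time points that subject happens to have, so no imputation or common grid is needed. Under the loss assumed here, both steps can only lower the objective, so `history` is non-increasing. Like classical $k$-means, the result depends on the initial labels, and several restarts may be needed in practice.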

Taxonomy

Topics: Gene expression and cancer classification · Advanced Clustering Algorithms Research · Face and Expression Recognition