A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data
Andrea Martino, Andrea Ghiglietti, Francesca Ieva, Anna M. Paganoni

TL;DR
This paper introduces a novel k-means clustering method for multivariate functional data using a generalized Mahalanobis distance, demonstrating superior performance in simulations and real-world applications like ECG and growth curves.
Contribution
It develops a new clustering approach based on a Mahalanobis-type metric for multivariate functions, improving accuracy over traditional distances.
Findings
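In simulations, the method outperforms k-means based on other distances commonly used for multivariate functional data, in both the mean and the standard deviation of the number of misclassified curves.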
It achieves lower misclassification rates in real data applications.
The generalized Mahalanobis distance effectively captures correlation and variability in functional data.
Abstract
This paper proposes a clustering procedure for samples of multivariate functions in $(L^2(I))^J$, with $J \geq 1$. This method is based on a k-means algorithm in which the distance between the curves is measured with a metric that generalizes the Mahalanobis distance in Hilbert spaces, considering the correlation and the variability along all the components of the functional data. The proposed procedure has been studied in simulation and compared with the k-means based on other distances typically adopted for clustering multivariate functional data. In these simulations, it is shown that the k-means algorithm with the generalized Mahalanobis distance provides the best clustering performance, both in terms of mean and standard deviation of the number of misclassified curves. Finally, the proposed method has been applied to two real case studies, concerning ECG signals and growth…
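To make the idea in the abstract concrete, here is a minimal Python sketch of k-means with a regularized Mahalanobis-type distance on discretized multivariate curves. It is an illustration under stated assumptions, not the authors' implementation: curves are assumed to be sampled on a common uniform grid, the covariance is estimated once from the whole sample, and the distance divides each squared principal-component score by the eigenvalue plus 1/p, where p is a hypothetical regularization parameter. The function names `mahalanobis_type_sq_dist` and `functional_kmeans` are invented for this example.

```python
import numpy as np


def mahalanobis_type_sq_dist(diff, eigvals, eigvecs, p):
    """Squared regularized Mahalanobis-type distance for each row of `diff`.

    Assumed form (for illustration only): sum_k score_k^2 / (lambda_k + 1/p),
    where score_k is the projection of the difference on the k-th eigenvector.
    """
    scores = diff @ eigvecs  # projections on the covariance eigenvectors
    return np.sum(scores ** 2 / (eigvals + 1.0 / p), axis=-1)


def functional_kmeans(X, k, p=1e3, n_iter=50, seed=0):
    """Cluster curves stored as an array of shape (n_curves, J, T).

    J is the number of components of each multivariate curve and T the
    number of grid points. Returns (labels, centers).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    flat = X.reshape(n, -1)  # concatenate the J components of each curve

    # Covariance of the concatenated curves, so that cross-component
    # correlation enters the distance (estimated once: an assumption).
    cov = np.cov(flat, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negatives

    # Initialise the centers with k distinct curves drawn at random.
    centers = flat[rng.choice(n, size=k, replace=False)]
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Assignment step: nearest center in the Mahalanobis-type distance.
        d2 = np.stack(
            [mahalanobis_type_sq_dist(flat - c, eigvals, eigvecs, p)
             for c in centers],
            axis=1,
        )
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels

        # Update step: each center becomes the mean curve of its cluster.
        for j in range(k):
            members = flat[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

    return labels, centers.reshape(k, *X.shape[1:])
```

In this sketch a small p damps directions with small eigenvalues, while a large p moves the distance closer to the classical Mahalanobis distance on the discretized data. Quadrature weights for the grid are omitted for brevity.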
Taxonomy
Topics: Advanced Statistical Methods and Models · Data Management and Algorithms · Sensory Analysis and Statistical Methods
