# Selection of the Number of Clusters in Functional Data Analysis

**Authors:** Adriano Zanin Zambom, Julian A. Collazos, Ronaldo Dias

arXiv: 1905.00977 · 2019-05-06

## TL;DR

This paper introduces a new measure, built from two test statistics, for estimating the number of clusters in functional data analysis. In simulations, procedures using the measure detect the correct number of clusters more often than existing methods, and the approach is illustrated on several real datasets.

## Contribution

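The paper proposes a measure that combines two test statistics, one for the lack of parallelism and one for the mean distance between curves, and uses it to compute criteria such as the within- and between-cluster sums of squares when selecting the number of clusters $K$.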

## Key findings

- In simulations, procedures using the proposed measure detect the correct number of clusters more frequently than existing methods.
- The simulation scenarios are described as challenging, and the measure can be used with different procedures for selecting $K$.
- The method is illustrated on several real datasets.

## Abstract

Identifying the number $K$ of clusters in a dataset is one of the most difficult problems in clustering analysis. A choice of $K$ that correctly characterizes the features of the data is essential for building meaningful clusters. In this paper we tackle the problem of estimating the number of clusters in functional data analysis by introducing a new measure that can be used with different procedures in selecting the optimal $K$. The main idea is to use a combination of two test statistics, which measure the lack of parallelism and the mean distance between curves, to compute criteria such as the within and between cluster sum of squares. Simulations in challenging scenarios suggest that procedures using this measure can detect the correct number of clusters more frequently than existing methods in the literature. The application of the proposed method is illustrated on several real datasets.
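To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' method. The paper's two test statistics are not given in this summary, so the sketch uses simple stand-ins: the variance of the pointwise difference between two curves as a "lack of parallelism" term, and the squared mean difference as a "mean distance" term. The function names, the `weight` parameter, and the toy curves are all assumptions made for illustration. The within-cluster dispersion uses the standard pairwise form $W_K = \sum_r \frac{1}{2 n_r} \sum_{i,j \in C_r} d_{ij}$.

```python
import numpy as np


def curve_dissimilarity(x, y, weight=0.5):
    """Illustrative dissimilarity between two curves on a common grid.

    Stand-in components only (assumptions, not the paper's statistics):
    - shape_term is zero when the curves differ by a vertical shift,
    - level_term is zero when the curves have the same average level.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    shape_term = diff.var()
    level_term = diff.mean() ** 2
    return weight * shape_term + (1.0 - weight) * level_term


def pairwise_dissimilarity(curves, weight=0.5):
    """Symmetric matrix of dissimilarities between all pairs of curves."""
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = curve_dissimilarity(curves[i], curves[j], weight)
    return d


def within_cluster_dispersion(d, labels):
    """Sum over clusters of pairwise dissimilarities divided by 2 * cluster size."""
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        total += d[np.ix_(idx, idx)].sum() / (2.0 * idx.size)
    return total


# Toy data (hypothetical): two sine-shaped curves and two straight lines,
# each pair differing only by a vertical shift of 0.1.
t = np.linspace(0.0, 1.0, 50)
curves = np.array([
    np.sin(2 * np.pi * t),
    np.sin(2 * np.pi * t) + 0.1,
    t,
    t + 0.1,
])

d = pairwise_dissimilarity(curves)
w_good = within_cluster_dispersion(d, [0, 0, 1, 1])  # groups curves by shape
w_bad = within_cluster_dispersion(d, [0, 1, 0, 1])   # mixes the two shapes
```

In this toy setting the partition that groups curves of the same shape yields a smaller within-cluster dispersion than the one that mixes shapes, which is the kind of quantity a procedure for choosing $K$ would compare across candidate values. With `weight=0.5` the stand-in measure reduces to half the mean squared difference between the curves, so other weights are needed to emphasize shape over level or vice versa.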

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.00977/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1905.00977/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/1905.00977/full.md

---
Source: https://tomesphere.com/paper/1905.00977