On Clustering Time Series Using Euclidean Distance and Pearson Correlation

Michael R. Berthold; Frank Höppner

arXiv:1601.02213·cs.LG·January 12, 2016·29 cites

Open Access

TL;DR

This paper shows that the squared Euclidean distance between z-score normalized time series equals a distance based on Pearson correlation. The result affects distance-based clustering methods such as k-Means and is supported by both a theoretical derivation and experiments.

Contribution

It establishes the equivalence between the z-score normalized, squared Euclidean distance and a distance based on Pearson correlation, and discusses the modification that k-Means formally needs for the Pearson correlation interpretation to remain strictly valid.

Findings

01

The z-score normalized, squared Euclidean distance equals a distance based on Pearson correlation.

02

In many cases, standard k-Means produces the same clustering results as the formally modified variant.

03

Theoretical and experimental validation of the equivalence.

Abstract

For time series comparisons, it has often been observed that z-score normalized Euclidean distances far outperform the unnormalized variant. In this paper we show that a z-score normalized, squared Euclidean distance is, in fact, equal to a distance based on Pearson correlation. This has a profound impact on many distance-based classification or clustering methods. In addition to this theoretically sound result, we also show that the often used k-Means algorithm formally needs a modification to keep the interpretation as Pearson correlation strictly valid. Experimental results demonstrate that in many cases the standard k-Means algorithm produces the same results.
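
The equivalence stated in the abstract can be checked numerically. The sketch below is not taken from the paper; it assumes z-scoring with the population standard deviation (NumPy's default, ddof=0), under which the squared Euclidean distance between two normalized series of length n works out to 2n(1 - r), where r is the Pearson correlation coefficient. The function names and the synthetic series are illustrative only. The last lines illustrate one possible reading of the k-Means remark, namely that the mean of z-normalized series is generally not itself z-normalized; that reading is an assumption, not a statement of the authors' modification.

```python
import numpy as np

def z_normalize(x):
    """Shift to zero mean and scale to unit population standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()  # np.std uses ddof=0 by default (assumed convention)

def squared_euclidean(a, b):
    """Sum of squared element-wise differences."""
    return float(np.sum((a - b) ** 2))

def pearson_distance(x, y):
    """1 - r, with r the Pearson correlation coefficient of x and y."""
    return 1.0 - float(np.corrcoef(x, y)[0, 1])

# Two synthetic, correlated series (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n).cumsum()
y = 0.5 * x + rng.normal(size=n)

zx, zy = z_normalize(x), z_normalize(y)
lhs = squared_euclidean(zx, zy)          # distance after z-normalization
rhs = 2 * n * pearson_distance(x, y)     # 2n(1 - r) on the raw series
print(lhs, rhs, np.isclose(lhs, rhs))

# Assumed reading of the k-Means remark: the plain mean of z-normalized
# series has zero mean but a standard deviation below 1 (unless r = 1),
# so it is not a z-normalized series until it is rescaled.
centroid = (zx + zy) / 2
print(centroid.std(), z_normalize(centroid).std())
```

With the sample standard deviation (ddof=1) the constant becomes 2(n - 1) instead of 2n; the proportionality to 1 - r is unchanged.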

Taxonomy

Topics: Time Series Analysis and Forecasting · Neural Networks and Applications · Anomaly Detection Techniques and Applications