Distribution Agnostic Symbolic Representations for Time Series Dimensionality Reduction and Online Anomaly Detection

Konstantinos Bountrogiannis; George Tzagkarakis; Panagiotis Tsakalides

arXiv:2105.09592·cs.IR·April 24, 2024

TL;DR

This paper introduces two new data-driven symbolic representations for time series that improve upon traditional SAX by reducing information loss and enhancing anomaly detection, verified through theoretical analysis and experiments.

Contribution

It proposes two novel SAX-based symbolic representations: one combining kernel density estimation with Lloyd-Max quantization, the other using Mean-Shift clustering. Both avoid the Gaussian and equiprobable-symbol assumptions of standard SAX.
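
For context, the baseline whose assumptions are being relaxed can be sketched as follows. This is a minimal illustration of classic SAX, not code from the paper: the function names (`znorm`, `paa`, `gaussian_breakpoints`, `sax`) are hypothetical, and the sketch assumes the series length is divisible by the number of segments. The breakpoints are the quantiles of a standard normal distribution, which is exactly where the Gaussian and equiprobable-symbol assumptions enter.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Z-normalize a series (zero mean, unit variance)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment.

    Assumes len(series) is divisible by n_segments (a simplification).
    """
    seg = len(series) // n_segments
    return [mean(series[i * seg:(i + 1) * seg]) for i in range(n_segments)]


def gaussian_breakpoints(alphabet_size):
    """Cut points that split N(0, 1) into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax(series, n_segments, alphabet_size):
    """Map a series to integer symbols 0 .. alphabet_size - 1."""
    cuts = gaussian_breakpoints(alphabet_size)
    return [bisect_right(cuts, v) for v in paa(znorm(series), n_segments)]
```

If the segment means are not actually Gaussian, these fixed cut points no longer produce equiprobable symbols, which is the limitation the data-driven representations target.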

Findings

01 Outperforms traditional SAX on real-world datasets

02 Enhances anomaly detection capabilities

03 Reduces information loss compared to existing methods

Abstract

Due to the importance of the lower bounding distances and the attractiveness of symbolic representations, the family of symbolic aggregate approximations (SAX) has been used extensively for encoding time series data. However, typical SAX-based methods rely on two restrictive assumptions: the Gaussian distribution and equiprobable symbols. This paper proposes two novel data-driven SAX-based symbolic representations, distinguished by their discretization steps. The first representation, oriented toward general data compaction and indexing scenarios, is based on the combination of kernel density estimation and Lloyd-Max quantization to minimize the information loss and mean squared error in the discretization step. The second method, oriented toward high-level mining tasks, employs the Mean-Shift clustering method and is shown to enhance anomaly detection in the lower-dimensional space. Besides,…
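
The idea of a data-driven discretization step can be illustrated with a rough sketch of Lloyd-Max quantization. This is not the authors' implementation: the abstract describes applying Lloyd-Max to a density obtained by kernel density estimation, whereas the sketch below skips the density estimate and runs the Lloyd iterations directly on the samples (equivalent to one-dimensional k-means). The names `lloyd_max` and `encode`, the quantile initialization, and the stopping tolerance are all assumptions made for illustration.

```python
from bisect import bisect_right


def lloyd_max(samples, n_levels, n_iter=100, tol=1e-9):
    """Fit a scalar quantizer that locally minimizes mean squared error.

    Assumption: operates on raw samples instead of a KDE-estimated density.
    Returns (cuts, levels): n_levels - 1 decision boundaries and
    n_levels reconstruction values.
    """
    xs = sorted(samples)
    n = len(xs)
    # Initialize reconstruction levels at evenly spaced sample quantiles.
    levels = [xs[min(n - 1, int((k + 0.5) * n / n_levels))]
              for k in range(n_levels)]
    for _ in range(n_iter):
        # Nearest-neighbor condition: boundaries are midpoints of levels.
        cuts = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
        sums = [0.0] * n_levels
        counts = [0] * n_levels
        for x in xs:
            j = bisect_right(cuts, x)
            sums[j] += x
            counts[j] += 1
        # Centroid condition: each level moves to the mean of its cell
        # (an empty cell keeps its previous level).
        new_levels = [sums[j] / counts[j] if counts[j] else levels[j]
                      for j in range(n_levels)]
        shift = max(abs(a - b) for a, b in zip(levels, new_levels))
        levels = new_levels
        if shift < tol:
            break
    cuts = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
    return cuts, levels


def encode(values, cuts):
    """Map values (e.g. segment means) to integer symbols using the cuts."""
    return [bisect_right(cuts, v) for v in values]
```

In a SAX-style pipeline, `cuts` would take the place of the fixed Gaussian breakpoints, so the alphabet adapts to the observed distribution of the data rather than to an assumed one.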
