Towards Symbolic Time Series Representation Improved by Kernel Density Estimators
Matej Kloska, Viera Rozinajova

TL;DR
This paper introduces edwSAX, an improved symbolic time series representation method that enhances the original SAX algorithm by better handling non-Gaussian data distributions and improving information coverage.
Contribution
The paper presents edwSAX, a novel extension of SAX that optimally covers the information space and satisfies the lower bounding criterion more tightly.
Findings
edwSAX outperforms SAX in time series reconstruction error
edwSAX provides tighter Euclidean distance lower bounds
The method effectively handles non-Gaussian data distributions
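To make the lower-bound finding concrete, the following is a minimal sketch of how bound tightness is usually measured for classic SAX: the ratio of the symbolic MINDIST to the true Euclidean distance, where a value closer to 1 means a tighter bound. It uses the standard SAX Gaussian breakpoints, not the edwSAX breakpoints, whose construction this summary does not describe. The function names (`mindist`, `tightness`, and the helpers) are illustrative and do not come from the paper.

```python
import bisect
import math
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Rescale to zero mean and unit variance."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series] if sigma > 0 else [0.0] * len(series)


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    size, rest = divmod(len(series), segments)
    if rest:
        raise ValueError("series length must be a multiple of segments")
    return [mean(series[i * size:(i + 1) * size]) for i in range(segments)]


def to_symbols(values, breakpoints):
    """Map each value to the index of the breakpoint interval it falls into."""
    return [bisect.bisect_right(breakpoints, v) for v in values]


def mindist(sym_q, sym_c, breakpoints, n):
    """Classic SAX MINDIST between two symbol sequences of original length n."""
    total = 0.0
    for r, c in zip(sym_q, sym_c):
        lo, hi = min(r, c), max(r, c)
        if hi - lo > 1:  # equal or adjacent symbols contribute zero
            total += (breakpoints[hi - 1] - breakpoints[lo]) ** 2
    return math.sqrt(n / len(sym_q)) * math.sqrt(total)


def tightness(q, c, segments, alphabet_size):
    """Ratio MINDIST / Euclidean distance on z-normalized series."""
    q, c = znorm(q), znorm(c)
    nd = NormalDist()
    bp = [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]
    lower = mindist(to_symbols(paa(q, segments), bp),
                    to_symbols(paa(c, segments), bp), bp, len(q))
    true = math.dist(q, c)
    return lower / true if true > 0 else 1.0


# Two synthetic series, used only for illustration.
q = [math.sin(0.2 * i) for i in range(64)]
c = [math.cos(0.2 * i) for i in range(64)]
print(tightness(q, c, segments=8, alphabet_size=6))
```

A comparison between representations under this measure would compute the same ratio with each method's breakpoints over many series pairs and average the results.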
Abstract
This paper deals with symbolic time series representation. It builds on the popular mapping technique Symbolic Aggregate approXimation (SAX), which is used extensively in sequence classification, pattern mining, anomaly detection, time series indexing and other data mining tasks. The disadvantage of this method, however, is that it works reliably only for time series with a Gaussian-like distribution. In our previous work we proposed an improvement of SAX, called dwSAX, which can deal with both Gaussian and non-Gaussian data distributions. We have since made further progress on this solution, resulting in edwSAX. Our goal was to cover the information space optimally by means of sufficient alphabet utilization, and to satisfy the lower bounding criterion as tightly as possible. We describe here our approach, including evaluation on commonly employed tasks such as time series…
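As an illustration of the Gaussian assumption discussed in the abstract, here is a minimal sketch of classic SAX (z-normalization, PAA, equiprobable Gaussian breakpoints) applied to a skewed series, next to a variant whose breakpoints are taken from the data's own quantiles. The quantile variant is only a stand-in to show how data-driven breakpoints change alphabet utilization; it is not the edwSAX or dwSAX procedure, which this summary does not detail. All function names are hypothetical.

```python
import bisect
import math
from collections import Counter
from statistics import NormalDist, mean, pstdev, quantiles


def znorm(series):
    """Rescale to zero mean and unit variance (SAX expects this first)."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series] if sigma > 0 else [0.0] * len(series)


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    size, rest = divmod(len(series), segments)
    if rest:
        raise ValueError("series length must be a multiple of segments")
    return [mean(series[i * size:(i + 1) * size]) for i in range(segments)]


def gaussian_breakpoints(alphabet_size):
    """Classic SAX: cut points that split N(0, 1) into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def empirical_breakpoints(values, alphabet_size):
    """Assumption for illustration: cut points from the data's own quantiles."""
    return quantiles(values, n=alphabet_size)


def symbolize(values, breakpoints):
    """Map each value to a letter according to the interval it falls into."""
    return "".join(chr(ord("a") + bisect.bisect_right(breakpoints, v)) for v in values)


# A strongly skewed (exponentially growing) series, far from Gaussian.
skewed = [math.exp(0.1 * i) for i in range(64)]
segment_means = paa(znorm(skewed), 16)

sax_word = symbolize(segment_means, gaussian_breakpoints(4))
alt_word = symbolize(segment_means, empirical_breakpoints(segment_means, 4))

# Symbol counts show how evenly each variant uses the alphabet.
print(sax_word, Counter(sax_word))
print(alt_word, Counter(alt_word))
```

Comparing the two symbol counts shows how evenly each set of breakpoints spreads the segments across the alphabet, which is the "alphabet utilization" the abstract refers to.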
