Error-bounded Approximate Time Series Joins Using Compact Dictionary Representations of Time Series
Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Wei Zhang, Eamonn Keogh

TL;DR
This paper introduces a novel dictionary-based method for efficient approximate inter-time series similarity joins with error guarantees, significantly accelerating anomaly detection while maintaining accuracy.
Contribution
It presents a new compact dictionary representation enabling fast, error-bounded approximate inter-time series joins, extending prior work focused mainly on self-joins.
Findings
Achieves at least 20X throughput improvement in anomaly mining systems.
Incurs essentially no decrease in accuracy.
Demonstrates utility across diverse domains like medicine and transportation.
Abstract
The matrix profile is an effective data mining tool that provides similarity join functionality for time series data. Users of the matrix profile can either join a time series with itself using intra-similarity join (i.e., self-join) or join a time series with another time series using inter-similarity join. By invoking either or both types of joins, the matrix profile can help users discover both conserved and anomalous structures in the data. Since the introduction of the matrix profile five years ago, multiple efforts have been made to speed up the computation with approximate joins; however, the majority of these efforts only focus on self-joins. In this work, we show that it is possible to efficiently perform approximate inter-time series similarity joins with error bounded guarantees by creating a compact "dictionary" representation of time series. Using the dictionary…
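To make the two join types concrete, here is a minimal brute-force sketch in Python. It is not the paper's dictionary method, whose details are cut off in the abstract above; it only illustrates what an exact self-join and an exact inter-similarity join compute. The function names (`znorm`, `ab_join`, `self_join`), the z-normalized Euclidean distance, and the exclusion zone of half the subsequence length are assumptions made for illustration.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return np.zeros_like(x) if sd == 0 else (x - x.mean()) / sd

def subsequences(ts, m):
    """Return every z-normalized length-m sliding window of ts, one per row."""
    ts = np.asarray(ts, dtype=float)
    return np.array([znorm(ts[i:i + m]) for i in range(len(ts) - m + 1)])

def ab_join(a, b, m):
    """Inter-similarity join: for each window of a, distance to its nearest window in b."""
    A, B = subsequences(a, m), subsequences(b, m)
    dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return dist.min(axis=1)

def self_join(a, m, excl=None):
    """Self-join: nearest neighbor within a, skipping trivially overlapping windows."""
    excl = m // 2 if excl is None else excl  # assumed exclusion-zone width
    A = subsequences(a, m)
    dist = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
    idx = np.arange(len(A))
    dist[np.abs(idx[:, None] - idx[None, :]) <= excl] = np.inf
    return dist.min(axis=1)

# Hypothetical data: a clean reference sine and a copy with a flat segment injected.
t = np.arange(300)
reference = np.sin(2 * np.pi * t / 50)
test = reference.copy()
test[150:170] = 0.0

m = 50
profile = ab_join(test, reference, m)
worst = int(np.argmax(profile))  # start of the window least like anything in reference
```

In this sketch, windows of `test` that do not touch the injected segment find an identical window in `reference`, so their profile values are zero, while windows overlapping the segment score high. The brute-force cost grows with the product of the two series lengths, which is the expense that approximate joins aim to reduce.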
Taxonomy
Topics: Time Series Analysis and Forecasting · Complex Systems and Time Series Analysis · Advanced Text Analysis Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
