A Generic Framework for Efficient and Effective Subsequence Retrieval
Haohan Zhu, George Kollios, Vassilis Athitsos

TL;DR
This paper introduces a unified framework for efficient subsequence matching in time series and string databases, leveraging a new property called 'consistency' and a novel index structure named 'reference net' to improve retrieval performance and scalability.
Contribution
It defines the 'consistency' property for distance functions, proves many popular measures satisfy it, and presents the 'reference net' index for scalable, effective subsequence retrieval.
Findings
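Most popular distance functions are consistent, including the Euclidean distance, DTW, ERP, and the Fréchet distance for time series, and the Hamming and Levenshtein distances for strings.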
The reference net scales well in space and time.
The framework improves retrieval performance across measures.
Abstract
This paper proposes a general framework for matching similar subsequences in both time series and string databases. The matching results are pairs of query subsequences and database subsequences. The framework finds all possible pairs of similar subsequences if the distance measure satisfies the "consistency" property, which is introduced in this paper. We show that most popular distance functions, such as the Euclidean distance, DTW, ERP, and the Fréchet distance for time series, and the Hamming distance and Levenshtein distance for strings, are all "consistent". We also propose a generic index structure for metric spaces named "reference net". The reference net occupies O(n) space, where n is the size of the dataset, and is optimized to work well with our framework. The experiments demonstrate the ability of our method to improve retrieval performance when combined with diverse…
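To make the problem statement concrete, here is a minimal brute-force sketch of what "pairs of query subsequences and database subsequences" can look like for strings under the Levenshtein distance. It is not the paper's framework and does not use the reference net. It is only the naive baseline that such a framework aims to avoid. The function names and the `min_len` and `max_dist` parameters are assumptions introduced for illustration.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def substrings(s: str, min_len: int):
    """Yield (start, end) bounds of every contiguous substring of length >= min_len."""
    for start in range(len(s)):
        for end in range(start + min_len, len(s) + 1):
            yield start, end


def similar_subsequence_pairs(query: str, database: str, min_len: int, max_dist: int):
    """Hypothetical baseline: compare every query substring with every database substring."""
    pairs = []
    for (qs, qe), (ds, de) in product(substrings(query, min_len),
                                      substrings(database, min_len)):
        d = levenshtein(query[qs:qe], database[ds:de])
        if d <= max_dist:
            pairs.append((query[qs:qe], database[ds:de], d))
    return pairs
```

For example, `similar_subsequence_pairs("abc", "xbcx", min_len=2, max_dist=0)` reports only the exact shared substring `"bc"`. The number of candidate pairs grows with the product of the squared lengths of both sequences, which is why pruning and indexing matter for this task.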
Taxonomy
Topics: Time Series Analysis and Forecasting · Data Management and Algorithms · Metabolomics and Mass Spectrometry Studies
