Efficiently Discovering Frequent Motifs in Large-scale Sensor Data
Puneet Agarwal, Gautam Shroff, Sarmimala Saikia, and Zaigham Khan

TL;DR
This paper introduces COIN clustering, an efficient method for discovering high-quality frequent motifs in large-scale sensor data that addresses the efficiency and quality limitations of existing techniques.
Contribution
The paper presents COIN clustering, a novel bounded spherical clustering approach that efficiently finds high-quality time-series motifs in voluminous data sets.
Findings
COIN clustering achieves near-linear complexity in data size.
The method effectively removes trivial and shifted motifs.
It successfully discovers motifs in large vehicular sensor data.
Abstract
While analyzing vehicular sensor data, we found that frequently occurring waveforms could serve as features for further analysis, such as rule mining, classification, and anomaly detection. The discovery of waveform patterns, also known as time-series motifs, has been studied extensively; however, available techniques for discovering frequently occurring time-series motifs were found lacking in either efficiency or quality: Standard subsequence clustering results in poor quality, to the extent that it has even been termed 'meaningless'. Variants of hierarchical clustering using techniques for efficient discovery of 'exact pair motifs' find high-quality frequent motifs, but at the cost of high computational complexity, making such techniques unusable for our voluminous vehicular sensor data. We show that good quality frequent motifs can be discovered using bounded spherical clustering of…
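To make the idea of clustering subsequences inside bounded spheres more concrete, here is a minimal sketch. It is not the authors' COIN algorithm, whose details are not given in this summary. It assumes a simple single-pass "leader" scheme in which each cluster is a sphere of fixed radius around its first member, and it compares z-normalized windows with Euclidean distance. The function names and the `width`, `radius`, and `step` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def z_normalize(window):
    """Scale a window to zero mean and unit variance; flat windows become zeros."""
    window = np.asarray(window, dtype=float)
    std = window.std()
    if std == 0:
        return np.zeros_like(window)
    return (window - window.mean()) / std


def bounded_sphere_clusters(series, width, radius, step=1):
    """Assign each window to the first cluster whose leader lies within `radius`.

    Assumption: a cluster is a sphere of fixed radius centered on its first
    member (the leader). This is an illustrative scheme, not the paper's method.
    """
    series = np.asarray(series, dtype=float)
    leaders = []  # z-normalized leader window of each cluster
    members = []  # start offsets of the windows assigned to each cluster
    for start in range(0, len(series) - width + 1, step):
        sub = z_normalize(series[start:start + width])
        for idx, leader in enumerate(leaders):
            if np.linalg.norm(sub - leader) <= radius:
                members[idx].append(start)
                break
        else:
            # No existing sphere contains this window, so it starts a new one.
            leaders.append(sub)
            members.append([start])
    return leaders, members


if __name__ == "__main__":
    # Two toy waveforms repeated back to back.
    series = [0, 1, 0, -1] * 3 + [1, 1, -1, -1] * 2
    leaders, members = bounded_sphere_clusters(series, width=4, radius=0.5, step=4)
    for offsets in members:
        print(len(offsets), "occurrences at offsets", offsets)
```

In this sketch, clusters with many members are candidate frequent motifs. Setting `step` equal to `width` avoids overlapping windows, a crude stand-in for the removal of trivial and shifted motifs that the paper describes. The cost of this simple version grows with the number of windows times the number of clusters, so it should not be read as reproducing the paper's complexity result.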
Taxonomy
Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications · Data Management and Algorithms
