A Latent Source Model for Nonparametric Time Series Classification
George H. Chen, Stanislav Nikolov, Devavrat Shah

TL;DR
This paper introduces a latent source model for nonparametric time series classification, providing theoretical guarantees and demonstrating effectiveness in forecasting trending Twitter topics with early detection capabilities.
Contribution
It proposes a novel latent source model that justifies nearest-neighbor classification for time series and offers performance guarantees under this model.
Findings
On synthetic data, weighted majority voting reaches the same misclassification rate as nearest-neighbor classification while observing less of the time series
Detected trending topics in advance of Twitter 79% of the time
True positive rate of 95% with a false positive rate of 4%
Abstract
For classifying time series, a nearest-neighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a "weighted majority voting" classification rule that can be…
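To make the "weighted majority voting" rule named in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes that each labeled training series votes for its own class with weight exp(-gamma * squared Euclidean distance) to the observed prefix, and it ignores time shifts. The function names, the `gamma` parameter, and the toy data are illustrative assumptions. Nearest-neighbor classification is included for comparison.

```python
import numpy as np

def _prefix_sq_dists(train_series, observed):
    # Squared Euclidean distance between the observed prefix and the
    # matching prefix of every training series (assumption: no time shifts).
    train_series = np.asarray(train_series, dtype=float)
    observed = np.asarray(observed, dtype=float)
    t = observed.shape[0]
    return np.sum((train_series[:, :t] - observed) ** 2, axis=1)

def weighted_majority_vote(train_series, train_labels, observed, gamma=1.0):
    # Every training series votes for its label with weight
    # exp(-gamma * distance); the class with the largest total wins.
    train_labels = np.asarray(train_labels)
    d = _prefix_sq_dists(train_series, observed)
    # Subtracting the minimum rescales all weights by a common factor,
    # which avoids underflow and leaves the winning class unchanged.
    weights = np.exp(-gamma * (d - d.min()))
    classes = np.unique(train_labels)
    scores = np.array([weights[train_labels == c].sum() for c in classes])
    return classes[np.argmax(scores)]

def nearest_neighbor(train_series, train_labels, observed):
    # Only the single closest training series decides the label.
    d = _prefix_sq_dists(train_series, observed)
    return np.asarray(train_labels)[np.argmin(d)]

# Toy data (hypothetical): label +1 for rising series, -1 for flat ones.
train = [[0, 1, 2, 3], [0, 1, 2, 4], [0, 0, 0, 0], [0, 0, 1, 0]]
labels = [1, 1, -1, -1]
observed_prefix = [0.0, 0.9]  # only the first two time steps are seen

wmv_label = weighted_majority_vote(train, labels, observed_prefix, gamma=1.0)
nn_label = nearest_neighbor(train, labels, observed_prefix)
```

In this sketch, a very large `gamma` concentrates almost all weight on the closest training series, so the vote behaves like nearest-neighbor classification; smaller values let more of the training set contribute.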
Taxonomy
Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications · Data Visualization and Analytics
