An Algorithm for Pattern Discovery in Time Series
Cosma Rohilla Shalizi, Kristina Lisa Shalizi, and James P. Crutchfield

TL;DR
This paper introduces an algorithm that discovers the causal states of a time series. It infers a minimal hidden Markov model directly from the data, without prior assumptions about the number of hidden states or their transition structure, and the states it infers have predictive optimality properties that conventional HMM states lack.
Contribution
The paper presents a data-driven algorithm for inferring minimal, predictively optimal hidden Markov models from time series data. Rather than fixing model complexity in advance, it starts from minimal structure and adds complexity only when the data demand it.
Findings
The algorithm reliably infers causal states from data.
Unlike conventional HMM fitting methods, it does not assume the number of hidden states or their transition structure.
The paper proves the algorithm's asymptotic reliability.
Abstract
We present a new algorithm for discovering patterns in time series and other sequential data. We exhibit a reliable procedure for building the minimal set of hidden, Markovian states that is statistically capable of producing the behavior exhibited in the data -- the underlying process's causal states. Unlike conventional methods for fitting hidden Markov models (HMMs) to data, our algorithm makes no assumptions about the process's causal architecture (the number of hidden states and their transition structure), but rather infers it from the data. It starts with assumptions of minimal structure and introduces complexity only when the data demand it. Moreover, the causal states it infers have important predictive optimality properties that conventional HMM states lack. We introduce the algorithm, review the theory behind it, prove its asymptotic reliability, use large deviation theory to…
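The idea of starting with minimal structure and adding states only when the data demand it can be illustrated with a small sketch. This is not the authors' procedure: it is a simplified stand-in that groups fixed-length histories by their empirical next-symbol distributions, and it uses a total-variation threshold where a proper statistical hypothesis test would be needed. The function names, the fixed history length `hist_len`, and the tolerance `tol` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def next_symbol_distributions(seq, hist_len):
    """Empirical P(next symbol | preceding hist_len symbols)."""
    counts = defaultdict(Counter)
    for i in range(hist_len, len(seq)):
        counts[seq[i - hist_len:i]][seq[i]] += 1
    return {
        hist: {sym: n / sum(c.values()) for sym, n in c.items()}
        for hist, c in counts.items()
    }


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    symbols = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in symbols)


def group_histories(seq, hist_len, tol=0.1):
    """Group histories whose next-symbol distributions roughly agree.

    Starts with no states and creates a new one only when a history's
    distribution differs from every existing state by more than tol.
    Each state is represented by the distribution of the first history
    placed in it (a simplification made for this sketch).
    """
    dists = next_symbol_distributions(seq, hist_len)
    states = []
    for hist in sorted(dists):
        for state in states:
            if total_variation(dists[hist], state["dist"]) <= tol:
                state["histories"].append(hist)
                break
        else:
            states.append({"histories": [hist], "dist": dict(dists[hist])})
    return [state["histories"] for state in states]


# In the period-3 sequence 001001..., "00" is always followed by 1,
# while "01" and "10" are both always followed by 0.
print(group_histories("001" * 40, hist_len=2))
# prints [['00'], ['01', '10']]
```

In this toy example the histories "01" and "10" end up in one group because they predict the same next symbol, so two candidate states suffice even though three distinct histories occur. The sketch groups histories by one-step predictions only and does not address the transition structure between states.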
Taxonomy
Topics: Algorithms and Data Compression · Cellular Automata and Applications · Blind Source Separation Techniques
