Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences
Cosma Rohilla Shalizi, Kristina Lisa Shalizi

TL;DR
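This paper introduces CSSR (Causal-State Splitting Reconstruction), an algorithm that builds optimal nonlinear predictors of discrete sequences, in the form of hidden Markov models, directly from data. Its results are superior to those of variable-length Markov models and at least comparable to those of cross-validated hidden Markov models.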
Contribution
The paper presents a novel algorithm, CSSR, for data-driven construction of optimal nonlinear predictors in the form of hidden Markov models for discrete sequences.
Findings
CSSR delivers better predictions than variable-length Markov models, both in theory and in experiments.
CSSR achieves results comparable or superior to cross-validated hidden Markov models.
The method requires minimal structural assumptions and is supported by theoretical and experimental validation.
Abstract
We present a new method for nonlinear prediction of discrete random sequences under minimal structural assumptions. We give a mathematical construction for optimal predictors of such processes, in the form of hidden Markov models. We then describe an algorithm, CSSR (Causal-State Splitting Reconstruction), which approximates the ideal predictor from data. We discuss the reliability of CSSR, its data requirements, and its performance in simulations. Finally, we compare our approach to existing methods using variable-length Markov models and cross-validated hidden Markov models, and show theoretically and experimentally that our method delivers results superior to the former and at least comparable to the latter.
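As a rough illustration of what "constructing a predictor from data" can mean for a discrete sequence, the sketch below estimates next-symbol distributions for fixed-length histories and groups histories whose estimates are close. It is not the CSSR algorithm itself: the fixed history length, the total-variation tolerance `tol`, the greedy grouping, and all function names are assumptions introduced here for illustration, and the sketch does not build the state transitions of a hidden Markov model.

```python
from collections import Counter, defaultdict


def next_symbol_distributions(seq, alphabet, history_len):
    """Estimate P(next symbol | last `history_len` symbols) from counts."""
    counts = defaultdict(Counter)
    for i in range(history_len, len(seq)):
        history = seq[i - history_len:i]
        counts[history][seq[i]] += 1
    dists = {}
    for history, c in counts.items():
        total = sum(c.values())
        dists[history] = tuple(c[a] / total for a in alphabet)
    return dists


def total_variation(p, q):
    """Total variation distance between two distributions on one alphabet."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def group_histories(dists, tol):
    """Greedily group histories whose estimated distributions lie within `tol`.

    Illustrative stand-in only: each group is compared through its first
    member's distribution, which is an assumption of this sketch.
    """
    states = []  # list of (representative distribution, member histories)
    for history in sorted(dists):
        for representative, members in states:
            if total_variation(dists[history], representative) <= tol:
                members.append(history)
                break
        else:
            states.append((dists[history], [history]))
    return states


if __name__ == "__main__":
    seq = "01" * 50  # a period-2 sequence
    dists = next_symbol_distributions(seq, alphabet="01", history_len=1)
    for representative, members in group_histories(dists, tol=0.1):
        print(members, representative)
```

On the period-2 example, the histories "0" and "1" predict different next symbols, so the sketch keeps them in separate groups; histories with near-identical estimates would be merged into one.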
Taxonomy
Topics: Algorithms and Data Compression · Blind Source Separation Techniques · Neural Networks and Applications
