Markovian embeddings of general random strings
Manuel Lladser

TL;DR
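This paper characterizes a wide class of adapted embeddings, built from a transformation R applied to the growing prefixes of a possibly non-Markovian sequence, that yield a first-order homogeneous Markov chain. This enables the analysis of pattern occurrences and their asymptotic distributions.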
Contribution
It shows that any transformation R has a unique coarsest refinement R' within this class of adapted embeddings such that the embedded sequence is Markovian, and it proposes a specific embedding for pattern analysis.
Findings
Established asymptotic distributions for pattern counts in non-Markovian sequences.
Proposed a Markovian embedding suitable for pattern occurrence analysis.
Analyzed a toy example demonstrating the embedding's effectiveness.
Abstract
Let A be a finite set and X a sequence of A-valued random variables. We do not assume any particular correlation structure between these random variables; in particular, X may be a non-Markovian sequence. An adapted embedding of X is a sequence of the form R(X_1), R(X_1,X_2), R(X_1,X_2,X_3), etc., where R is a transformation defined over finite-length sequences. In this extended abstract, we characterize a wide class of adapted embeddings of X that result in a first-order homogeneous Markov chain. We show that any transformation R has a unique coarsest refinement R' in this class such that R'(X_1), R'(X_1,X_2), R'(X_1,X_2,X_3), etc., is Markovian. (By refinement we mean that R'(u)=R'(v) implies R(u)=R(v), and by coarsest refinement we mean that R' is a deterministic function of any other refinement of R in our class of transformations.) We propose a specific embedding that we denote as R^X…
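The two definitions in the abstract, an adapted embedding and a refinement, can be illustrated with a short Python sketch. All names here (`adapted_embedding`, `make_suffix_prefix_R`, `is_refinement`) are hypothetical and chosen for illustration. The transformation used is the classical "longest suffix that is a prefix of the pattern" state from string matching, which is an assumption of this sketch and is not the paper's R^X. The refinement check is only a finite test over short words, not a proof.

```python
from itertools import product


def adapted_embedding(R, xs):
    """Return [R(x_1), R(x_1, x_2), ..., R(x_1, ..., x_n)] for a finite sequence xs."""
    return [R(tuple(xs[:n])) for n in range(1, len(xs) + 1)]


def make_suffix_prefix_R(pattern):
    """Hypothetical transformation R (an assumption, not the paper's R^X):
    R(u) is the length of the longest suffix of u that is a prefix of pattern."""
    pattern = tuple(pattern)

    def R(u):
        u = tuple(u)
        for k in range(min(len(u), len(pattern)), 0, -1):
            if u[-k:] == pattern[:k]:
                return k
        return 0

    return R


def is_refinement(R_fine, R_coarse, words):
    """Finite check of the refinement property on the given words:
    R_fine(u) == R_fine(v) must imply R_coarse(u) == R_coarse(v)."""
    seen = {}
    for u in words:
        key, val = R_fine(u), R_coarse(u)
        if seen.setdefault(key, val) != val:
            return False
    return True


pattern = "aba"
R = make_suffix_prefix_R(pattern)

# Embedding of one realization of X; a state equal to len(pattern)
# marks a (possibly overlapping) occurrence of the pattern.
states = adapted_embedding(R, "ababa")
occurrences = sum(1 for s in states if s == len(pattern))

# All words of length 1 to 4 over the alphabet {a, b}.
words = [w for n in range(1, 5) for w in product("ab", repeat=n)]


def identity(u):
    return tuple(u)


def ends_with(u):
    return R(u) == len(pattern)


print(states, occurrences)
print(is_refinement(identity, R, words))   # identity refines every transformation
print(is_refinement(R, ends_with, words))  # R refines the occurrence indicator
print(is_refinement(ends_with, R, words))  # the indicator does not refine R
```

Whether the embedded sequence is Markovian depends on the law of X, which this sketch does not model. It only shows the shape of the objects involved: a transformation applied to growing prefixes, and the partial order between transformations induced by refinement.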
Taxonomy
Topics: Algorithms and Data Compression · Bayesian Methods and Mixture Models · Mathematical Dynamics and Fractals
