Information-Preserving Markov Aggregation

Bernhard C. Geiger; Christoph Temmel

arXiv:1304.0920 · cs.IT · December 20, 2013 · 2 cites

Open Access

TL;DR

This paper gives a sufficient condition and an algorithm for reducing the state space of a Markov chain by merging states without changing its entropy rate, which amounts to lossless, sample-by-sample compression of the Markov source.

Contribution

It provides a sufficient condition for information-preserving state space reduction and an algorithm to find all such reductions for a given Markov chain.

Findings

01. The size of the reduced state space is bounded from below by the node degrees of the transition graph of the original chain.

02. The algorithm lists all information-preserving state space reductions for a given transition graph.

03. The results are illustrated on a bi-gram letter model of an English text.

Abstract

We present a sufficient condition for a non-injective function of a Markov chain to be a second-order Markov chain with the same entropy rate as the original chain. This permits an information-preserving state space reduction by merging states or, equivalently, lossless compression of a Markov source on a sample-by-sample basis. The cardinality of the reduced state space is bounded from below by the node degrees of the transition graph associated with the original Markov chain. We also present an algorithm listing all possible information-preserving state space reductions, for a given transition graph. We illustrate our results by applying the algorithm to a bi-gram letter model of an English text.
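
As a rough illustration of the two quantities the abstract refers to, the sketch below computes the entropy rate of a small Markov chain and the largest node degree of its transition graph. It is not the paper's algorithm. The function names, the example matrix `P`, the use of power iteration, and the reading of "node degree" as out-degree are all assumptions made for this example, and the chain is assumed to be irreducible and aperiodic so that the iteration converges.

```python
import math

def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution by power iteration.

    Assumes P is a row-stochastic matrix (list of lists) of an
    irreducible, aperiodic chain; otherwise this may not converge.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def entropy_rate(P):
    """Entropy rate in bits: -sum_i pi_i sum_j P_ij log2 P_ij."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )

def max_out_degree(P):
    """Largest number of positive-probability successors of any state
    (our reading of 'node degree'; an assumption, not the paper's definition)."""
    return max(sum(1 for p in row if p > 0) for row in P)

# Hypothetical 3-state chain, chosen only for illustration.
P = [
    [0.5, 0.25, 0.25],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]

print(round(entropy_rate(P), 6))  # 1.0 (bits per sample)
print(max_out_degree(P))          # 3
```

Under the out-degree reading, state 0 of this example has three possible successors, so the bound quoted in the abstract would rule out any information-preserving reduction to fewer than three states, meaning no states of this chain could be merged.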

Taxonomy

Topics: Algorithms and Data Compression · Bayesian Modeling and Causal Inference · DNA and Biological Computing