Higher-Order Kullback-Leibler Aggregation of Markov Chains

Bernhard C. Geiger; Yuchen Wu

arXiv:1608.04637 · cs.IT · June 13, 2017 · 2 cites

TL;DR

This paper studies how to reduce a first-order Markov chain on a large alphabet to a higher-order Markov chain on a small alphabet, using information-theoretic cost functions. Its experiments suggest that the approach is useful for model reduction in reliability analysis and natural language processing.

Contribution

It presents information-theoretic cost functions related to predictability and lumpability for higher-order aggregation of Markov chains, shows how these cost functions relate to one another, and discusses heuristics for minimizing them.

Findings

01 The experiments suggest that generalizing aggregation to higher orders is useful for model reduction.

02 The experiments suggest benefits for model reduction in natural language processing.

03 The experiments suggest benefits for model reduction in reliability analysis.

Abstract

We consider the problem of reducing a first-order Markov chain on a large alphabet to a higher-order Markov chain on a small alphabet. We present information-theoretic cost functions that are related to predictability and lumpability, show relations between these cost functions, and discuss heuristics to minimize them. Our experiments suggest that the generalization to higher orders is useful for model reduction in reliability analysis and natural language processing.
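
The setting in the abstract can be illustrated with a small sketch. The Python code below is not the paper's cost functions or heuristics. It assumes a finite, irreducible, aperiodic chain given as a row-stochastic matrix `P`, and a hypothetical partition `g` that maps each original state to a group label. With Y_t = g(X_t), it computes the gap H(Y_{k+1} | Y_1, ..., Y_k) - H(Y_{k+1} | X_k), which measures how much worse the last k aggregated symbols predict the next aggregated symbol than the current original state does. The gap is non-negative and cannot grow with k. The names `stationary` and `entropy_gap` and both example matrices are illustrative assumptions.

```python
from collections import defaultdict
from math import log2


def stationary(P, iters=10_000):
    """Approximate the stationary distribution by power iteration.

    Assumes P is row-stochastic, irreducible, and aperiodic.
    """
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu


def entropy_gap(P, g, k):
    """Return H(Y_{k+1} | Y_1..Y_k) - H(Y_{k+1} | X_k) in bits, with Y_t = g[X_t]."""
    n = len(P)
    mu = stationary(P)

    # joint[(y_1..y_t), x_t] = P(Y_1..Y_t = y_1..y_t, X_t = x_t), starting at t = 1.
    joint = {((g[x],), x): mu[x] for x in range(n) if mu[x] > 0}
    for _ in range(k - 1):
        nxt = defaultdict(float)
        for (ys, x), p in joint.items():
            for x2 in range(n):
                if P[x][x2] > 0:
                    nxt[(ys + (g[x2],), x2)] += p * P[x][x2]
        joint = nxt

    # Marginals over aggregated words of length k and k + 1.
    past = defaultdict(float)
    full = defaultdict(float)
    for (ys, x), p in joint.items():
        past[ys] += p
        for x2 in range(n):
            if P[x][x2] > 0:
                full[ys + (g[x2],)] += p * P[x][x2]
    h_agg = -sum(p * log2(p / past[ys[:-1]]) for ys, p in full.items() if p > 0)

    # H(Y_{k+1} | X_k) equals H(Y_2 | X_1) because the chain is stationary.
    h_orig = 0.0
    for x in range(n):
        q = defaultdict(float)
        for x2 in range(n):
            q[g[x2]] += P[x][x2]
        h_orig -= mu[x] * sum(v * log2(v) for v in q.values() if v > 0)

    return h_agg - h_orig


# Hypothetical 3-state chains, both aggregated with states 1 and 2 merged.
g = [0, 1, 1]
# States 1 and 2 move to group 0 with the same probability, so the gap vanishes.
P_lumpable = [[0.5, 0.25, 0.25], [0.2, 0.4, 0.4], [0.2, 0.3, 0.5]]
# States 1 and 2 behave differently, so the aggregated chain loses information.
P_other = [[0.1, 0.6, 0.3], [0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]

gap_lumpable = entropy_gap(P_lumpable, g, 1)
gap_order1 = entropy_gap(P_other, g, 1)
gap_order2 = entropy_gap(P_other, g, 2)
```

In this sketch the number of aggregated words grows exponentially with k, so it is only practical for small alphabets and low orders. Comparing `gap_order1` with `gap_order2` shows how conditioning on a longer aggregated history can recover part of the predictive information lost by merging states.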

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Text and Document Classification Technologies · Machine Learning and Algorithms