Higher-Order Kullback-Leibler Aggregation of Markov Chains
Bernhard C. Geiger, Yuchen Wu

TL;DR
This paper introduces a method for reducing a first-order Markov chain on a large alphabet to a higher-order Markov chain on a small alphabet using information-theoretic cost functions, with applications to model reduction in reliability analysis and natural language processing.
Contribution
It proposes information-theoretic cost functions related to predictability and lumpability for higher-order Markov chain aggregation, shows how these cost functions relate to one another, and discusses heuristics for minimizing them.
Findings
Experiments suggest that generalizing aggregation to higher orders is useful for model reduction.
The benefit is observed in natural language processing.
The benefit is also observed for models used in reliability analysis.
Abstract
We consider the problem of reducing a first-order Markov chain on a large alphabet to a higher-order Markov chain on a small alphabet. We present information-theoretic cost functions that are related to predictability and lumpability, show relations between these cost functions, and discuss heuristics to minimize them. Our experiments suggest that the generalization to higher orders is useful for model reduction in reliability analysis and natural language processing.
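To make the reduction described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the order-k model on the small alphabet is taken to be the conditional distribution of the projected stationary process given its k previous symbols; the function names (`stationary`, `higher_order_model`), the example matrix `P`, and the partition `g` are all hypothetical and chosen only for illustration. The cost functions and minimization heuristics of the paper are not reproduced here.

```python
from itertools import product


def stationary(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution of P by power iteration.

    Assumes P is a row-stochastic matrix (list of lists) of an
    irreducible, aperiodic chain.
    """
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, mu)) < tol:
            return new
        mu = new
    return mu


def higher_order_model(P, g, m, k):
    """Order-k conditional distributions of the projected process Y_t = g[X_t].

    g[i] in range(m) is the group (small-alphabet symbol) of state i.
    Returns a dict mapping each length-k context with positive probability
    to the distribution of the next symbol.
    """
    n = len(P)
    mu = stationary(P)
    model = {}
    for ctx in product(range(m), repeat=k):
        # alpha[i] = Pr(Y_0..Y_{k-1} = ctx, X_{k-1} = i) under stationarity
        alpha = [mu[i] if g[i] == ctx[0] else 0.0 for i in range(n)]
        for sym in ctx[1:]:
            alpha = [
                sum(alpha[i] * P[i][j] for i in range(n)) if g[j] == sym else 0.0
                for j in range(n)
            ]
        mass = sum(alpha)
        if mass == 0.0:
            continue  # context never occurs, no conditional distribution
        nxt = [0.0] * m
        for i in range(n):
            for j in range(n):
                nxt[g[j]] += alpha[i] * P[i][j]
        model[ctx] = [x / mass for x in nxt]
    return model


# Hypothetical 3-state chain; states 1 and 2 are merged into one symbol.
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]
g = [0, 1, 1]

first_order = higher_order_model(P, g, m=2, k=1)
second_order = higher_order_model(P, g, m=2, k=2)
```

In this example the partition is lumpable (states 1 and 2 move to each group with the same probabilities), so the second-order model carries no more information than the first-order one. For a partition that is not lumpable, the conditional distributions generally depend on the longer context, which is the situation in which a higher-order model on the small alphabet can help.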
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Text and Document Classification Technologies · Machine Learning and Algorithms
