On Determining and Qualifying the Number of Superstates in Aggregation of Markov Chains
Amber Srivastava, Raj K. Velicheti, and Srinivasa M. Salapaka

TL;DR
This paper introduces a structured methodology for determining the optimal number of superstates in Markov chain aggregation by comparing marginal returns, justified by the Maximum Entropy Principle, and validated through synthetic and real-world simulations.
Contribution
The paper proposes a novel, principled approach to select the number of superstates in Markov chain aggregation using marginal return comparison and Maximum Entropy justification.
Findings
Largest marginal return identifies true number of superstates in synthetic chains
Method reveals inherent structure in real-world Markov models
Approach outperforms heuristic methods in aggregation quality
Abstract
Many studies involving large Markov chains require determining a smaller representative (aggregated) chain. Each superstate in the representative chain represents a group of related states in the original Markov chain. Typically, the choice of the number of superstates in the aggregated chain is ambiguous and based on limited prior knowledge. In this paper, we present a structured methodology for determining the best candidate for the number of superstates. We achieve this by comparing aggregated chains of different sizes. To facilitate this comparison, we develop and quantify a notion of marginal return. Our notion captures the decrease in the heterogeneity within the group of related states (i.e., states represented by the same superstate) upon a unit increase in the number of superstates in the aggregated chain. We use the Maximum Entropy Principle to…
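The idea of comparing aggregated chains by marginal return can be illustrated with a small sketch. It is not the paper's quantification, which the truncated abstract does not spell out. It assumes heterogeneity is measured as the summed KL divergence between each state's transition row and the mean row of its group, and that marginal return is the raw drop in that quantity when one superstate is added. The matrix P, the candidate partitions, and all function names are hypothetical and chosen only for illustration.

```python
import math

def kl(p, q):
    """KL divergence D(p || q) for discrete distributions, with 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def heterogeneity(P, groups):
    """Assumed measure: sum over states of KL(row || mean row of its group)."""
    n = len(P)
    total = 0.0
    for g in groups:
        mean = [sum(P[i][j] for i in g) / len(g) for j in range(n)]
        total += sum(kl(P[i], mean) for i in g)
    return total

def marginal_returns(h):
    """h[k-1] is the heterogeneity with k superstates.
    Returns {k+1: h_k - h_{k+1}}, the drop gained by the (k+1)-th superstate."""
    return {k + 2: h[k] - h[k + 1] for k in range(len(h) - 1)}

# Hypothetical 4-state chain with two groups of similar states: {0, 1} and {2, 3}.
P = [
    [0.40, 0.50, 0.05, 0.05],
    [0.50, 0.40, 0.05, 0.05],
    [0.05, 0.05, 0.40, 0.50],
    [0.05, 0.05, 0.50, 0.40],
]

# Hand-picked partitions for k = 1..4 superstates (an aggregation
# algorithm would normally produce these; they are fixed here for brevity).
partitions = [
    [[0, 1, 2, 3]],
    [[0, 1], [2, 3]],
    [[0, 1], [2], [3]],
    [[0], [1], [2], [3]],
]

h = [heterogeneity(P, groups) for groups in partitions]
returns = marginal_returns(h)
best_k = max(returns, key=returns.get)  # size reached by the largest drop
print(best_k, returns)
```

In this toy chain the largest drop occurs when moving from one superstate to two, which matches the two groups built into P. With a raw difference like this the first split often dominates, so a normalized or relative measure may be needed in practice.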
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models
