On the Relation of State Space Models and Hidden Markov Models
Aydin Ghojogh, M. Hadi Sepanj, Benyamin Ghojogh

TL;DR
This paper systematically compares classical probabilistic state space models and hidden Markov models with modern neural sequence models, clarifying their relationships, differences, and implications across control, probabilistic modeling, and deep learning.
Contribution
It provides a unified analysis of HMMs, linear Gaussian SSMs, Kalman filtering, and modern NLP state space models, highlighting their structural and semantic connections and differences.
Findings
Clarifies when the models are equivalent and when they diverge
Analyzes inference algorithms like forward-backward and Kalman filtering
Links classical models with modern neural sequence architectures
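To make the comparison in these findings concrete, here is a minimal sketch, not taken from the paper, of one filtering step for a discrete HMM (the forward recursion), one filtering step for a linear Gaussian SSM (the Kalman filter), and a deterministic linear state recurrence of the kind that S4-style layers build on. All function names, argument names, and toy numbers are hypothetical and chosen only for illustration. The sketch assumes a discrete-state HMM with a row-stochastic transition matrix, and it omits the backward pass of forward-backward.

```python
import numpy as np

def hmm_forward_step(alpha_prev, A, B, obs):
    """One normalized forward (filtering) step for a discrete HMM.

    alpha_prev: (K,) filtered state distribution at time t-1
    A: (K, K) transitions, A[i, j] = p(z_t = j | z_{t-1} = i)
    B: (K, M) emissions,   B[j, o] = p(x_t = o | z_t = j)
    obs: integer index of the observed symbol at time t
    """
    predicted = alpha_prev @ A          # predict: push belief through transitions
    unnorm = predicted * B[:, obs]      # update: weight by emission likelihood
    return unnorm / unnorm.sum()        # normalize to a distribution

def kalman_step(mean_prev, cov_prev, A, C, Q, R, y):
    """One Kalman filter step for z_t = A z_{t-1} + w, y_t = C z_t + v,
    with w ~ N(0, Q) and v ~ N(0, R)."""
    mean_pred = A @ mean_prev                       # predict mean
    cov_pred = A @ cov_prev @ A.T + Q               # predict covariance
    S = C @ cov_pred @ C.T + R                      # innovation covariance
    K = cov_pred @ C.T @ np.linalg.inv(S)           # Kalman gain
    mean = mean_pred + K @ (y - C @ mean_pred)      # update mean
    cov = (np.eye(len(mean_prev)) - K @ C) @ cov_pred  # update covariance
    return mean, cov

def deterministic_ssm_scan(A, B, C, inputs):
    """Deterministic linear recurrence h_t = A h_{t-1} + B u_t, y_t = C h_t.
    There is no noise and no posterior: the state is a computed feature."""
    h = np.zeros(A.shape[0])
    outputs = []
    for u in inputs:                    # scalar input per time step
        h = A @ h + B * u
        outputs.append(C @ h)
    return np.array(outputs)

if __name__ == "__main__":
    # Toy HMM with two states and two symbols (hypothetical numbers).
    A_hmm = np.array([[0.9, 0.1], [0.2, 0.8]])
    B_hmm = np.array([[0.7, 0.3], [0.1, 0.9]])
    print(hmm_forward_step(np.array([0.5, 0.5]), A_hmm, B_hmm, obs=1))

    # Toy one-dimensional Kalman step (hypothetical numbers).
    one = np.array([[1.0]])
    print(kalman_step(np.array([0.0]), one, one, one, 0 * one, one, np.array([2.0])))
```

Both filters share the same predict-then-update shape. The HMM step carries a probability vector over finitely many states, while the Kalman step carries a Gaussian mean and covariance over a continuous state. The deterministic scan keeps only the linear state update and drops the probabilistic inference altogether.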
Abstract
State Space Models (SSMs) and Hidden Markov Models (HMMs) are foundational frameworks for modeling sequential data with latent variables and are widely used in signal processing, control theory, and machine learning. Despite their shared temporal structure, they differ fundamentally in the nature of their latent states, probabilistic assumptions, inference procedures, and training paradigms. Recently, deterministic state space models have re-emerged in natural language processing through architectures such as S4 and Mamba, raising new questions about the relationship between classical probabilistic SSMs, HMMs, and modern neural sequence models. In this paper, we present a unified and systematic comparison of HMMs, linear Gaussian state space models, Kalman filtering, and contemporary NLP state space models. We analyze their formulations through the lens of probabilistic graphical…
Taxonomy
Topics: Machine Learning in Healthcare · Bayesian Modeling and Causal Inference · Adversarial Robustness in Machine Learning
