Infinite Hidden Markov Models for Multiple Multivariate Time Series with Missing Data
Lauren Hoskovec, Matthew D. Koslovsky, Kirsten Koehler, Nicholas Good, Jennifer L. Peel, John Volckens, and Ander Wilson

TL;DR
This paper introduces an infinite hidden Markov model tailored for multiple multivariate time series with missing data, enhancing state estimation and data imputation in air pollution exposure studies.
Contribution
It develops a novel Bayesian infinite hidden Markov model incorporating covariates, with specialized sampling and imputation algorithms for incomplete multivariate time series.
Findings
In simulation studies, the model accurately estimates hidden states and state-specific means.
It improves imputation of observations that are missing at random or below the limit of detection.
Identifies shared activity and exposure patterns across individuals.
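The imputation of values below the limit of detection noted in the findings can be illustrated with a small sketch. It assumes, purely for illustration, that within a hidden state an exposure is univariate Gaussian with a known state-specific mean and standard deviation, and it draws a replacement value from that distribution truncated above at the limit of detection. The function name `impute_below_lod` and its arguments are hypothetical; the paper's model is multivariate, and its emission distribution is not specified in this summary.

```python
import random
from statistics import NormalDist


def impute_below_lod(mean, sd, lod, rng):
    """Draw one value from Normal(mean, sd) truncated to (-inf, lod).

    Illustrative assumption: univariate Gaussian emissions per state.
    Uses inverse-CDF sampling restricted to the mass below `lod`.
    """
    dist = NormalDist(mean, sd)
    p_lod = dist.cdf(lod)  # probability mass below the detection limit
    if p_lod <= 0.0:
        raise ValueError("no representable probability mass below lod")
    u = rng.random() * p_lod  # uniform on [0, p_lod)
    while u <= 0.0:  # inv_cdf requires 0 < u < 1
        u = rng.random() * p_lod
    return dist.inv_cdf(u)


rng = random.Random(1)
# Hypothetical state: mean 1.0, sd 0.5, detection limit 0.8.
draws = [impute_below_lod(1.0, 0.5, 0.8, rng) for _ in range(5)]
print(draws)
```

In a Bayesian multiple imputation scheme, a draw like this would be repeated at each sampler iteration, conditional on the current hidden state, so that uncertainty in the censored value propagates into the other estimates.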
Abstract
Exposure to air pollution is associated with increased morbidity and mortality. Recent technological advancements permit the collection of time-resolved personal exposure data. Such data are often incomplete with missing observations and exposures below the limit of detection, which limit their use in health effects studies. In this paper we develop an infinite hidden Markov model for multiple asynchronous multivariate time series with missing data. Our model is designed to include covariates that can inform transitions among hidden states. We implement beam sampling, a combination of slice sampling and dynamic programming, to sample the hidden states, and a Bayesian multiple imputation algorithm to impute missing data. In simulation studies, our model excels in estimating hidden states and state-specific means and imputing observations that are missing at random or below the limit of…
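The abstract describes beam sampling as slice sampling combined with dynamic programming. A minimal sketch of one sweep for a single series follows, assuming a finite truncation to K states, a known transition matrix, and precomputed, strictly positive emission likelihoods. The name `beam_sweep` and all of its arguments are illustrative and not taken from the paper; covariate-dependent transitions, multiple series, and missing-data handling are omitted.

```python
import numpy as np


def beam_sweep(states, pi0, pi, lik, rng):
    """One beam-sampling sweep over a single state sequence.

    states: current state path, length T (ints in 0..K-1)
    pi0:    initial state probabilities, shape (K,)
    pi:     transition matrix, shape (K, K)
    lik:    emission likelihoods, shape (T, K), assumed > 0
    """
    states = np.asarray(states, dtype=int)
    T, K = lik.shape

    # Slice step: one auxiliary variable per time point, drawn below the
    # probability of the transition the current path actually uses.
    u = np.empty(T)
    u[0] = rng.uniform(0.0, pi0[states[0]])
    u[1:] = rng.uniform(0.0, pi[states[:-1], states[1:]])

    # Forward pass (dynamic programming): only transitions whose
    # probability exceeds the slice variable are allowed.
    alpha = np.zeros((T, K))
    alpha[0] = lik[0] * (pi0 > u[0])
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        allowed = (pi > u[t]).astype(float)  # K x K indicator
        alpha[t] = lik[t] * (alpha[t - 1] @ allowed)
        alpha[t] /= alpha[t].sum()

    # Backward pass: sample a new path from the end to the start.
    new = np.empty(T, dtype=int)
    new[-1] = rng.choice(K, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * (pi[:, new[t + 1]] > u[t + 1])
        new[t] = rng.choice(K, p=w / w.sum())
    return new
```

In the infinite model the slice variables are what make the computation finite: only finitely many transitions can have probability above a positive threshold, so the forward pass runs over a finite set of states even though the state space is unbounded. The fixed K here is a simplification of that idea.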
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Air Quality and Health Impacts · Data-Driven Disease Surveillance
