Statistical Jump Model for Mixed-Type Data with Missing Data Imputation
Federico P. Cortese, Antonio Pievatolo

TL;DR
This paper introduces a statistical jump model tailored for mixed-type data with temporal dynamics, effectively handling missing data and improving interpretability for clustering tasks, demonstrated through simulations and air quality data analysis.
Contribution
The paper presents a novel regime-based clustering model for mixed-type temporal data that incorporates missing data handling and interpretability enhancements.
Findings
Infers persistent air quality regimes better than the traditional air quality index.
Effectively manages missing data in complex datasets.
Provides practical insights for environmental monitoring.
Abstract
In this paper, we address the challenge of clustering mixed-type data with temporal evolution by introducing the statistical jump model for mixed-type data. This novel framework incorporates regime persistence, enhancing interpretability and reducing the frequency of state switches, and efficiently handles missing data. The model is easily interpretable through its state-conditional means and modes, making it accessible to practitioners and policymakers. We validate our approach through extensive simulation studies and an empirical application to air quality data, demonstrating its superiority in inferring persistent air quality regimes compared to the traditional air quality index. Our contributions include a robust method for mixed-type temporal clustering, effective missing data management, and practical insights for environmental monitoring.
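The abstract describes three ingredients: state-conditional means and modes, a preference for persistent regimes, and tolerance of missing entries. The sketch below shows one way these pieces could fit together. It is not the authors' algorithm. It assumes a squared-error loss on continuous features, a 0/1 mismatch loss on categorical features, a fixed jump penalty `lam` per state switch, missing values coded as `NaN` (continuous) or `-1` (categorical), and a deterministic block initialisation. All function names are hypothetical.

```python
import numpy as np

MISSING_CAT = -1  # assumed code for a missing categorical value


def mixed_loss(X_num, X_cat, means, modes):
    """Loss of each observation under each state, shape (T, K).

    Missing entries contribute zero, so they never drive the state assignment.
    """
    num = (X_num[:, None, :] - means[None, :, :]) ** 2       # (T, K, p)
    num = np.nan_to_num(num, nan=0.0)
    cat = (X_cat[:, None, :] != modes[None, :, :]).astype(float)
    cat = np.where((X_cat == MISSING_CAT)[:, None, :], 0.0, cat)
    return num.sum(axis=2) + cat.sum(axis=2)


def decode_states(loss, lam):
    """Dynamic programme: minimise total loss plus lam per state switch."""
    T, K = loss.shape
    penalty = lam * (1.0 - np.eye(K))                        # (prev, cur)
    cost = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    cost[0] = loss[0]
    for t in range(1, T):
        trans = cost[t - 1][:, None] + penalty
        back[t] = trans.argmin(axis=0)
        cost[t] = loss[t] + trans.min(axis=0)
    states = np.zeros(T, dtype=int)
    states[-1] = cost[-1].argmin()
    for t in range(T - 2, -1, -1):
        states[t] = back[t + 1, states[t + 1]]
    return states


def update_prototypes(X_num, X_cat, states, means, modes):
    """Recompute state-conditional means and modes from observed entries only."""
    means, modes = means.copy(), modes.copy()
    for k in range(means.shape[0]):
        idx = states == k
        for j in range(X_num.shape[1]):
            obs = X_num[idx, j]
            obs = obs[~np.isnan(obs)]
            if obs.size:                 # keep the old value if nothing is observed
                means[k, j] = obs.mean()
        for j in range(X_cat.shape[1]):
            obs = X_cat[idx, j]
            obs = obs[obs != MISSING_CAT]
            if obs.size:
                modes[k, j] = np.bincount(obs).argmax()
    return means, modes


def fit_jump_model(X_num, X_cat, K, lam, n_iter=20):
    """Alternate prototype updates and state decoding until states stop changing."""
    T = X_num.shape[0]
    states = np.arange(T) * K // T       # assumed init: K contiguous time blocks
    means = np.zeros((K, X_num.shape[1]))
    modes = np.zeros((K, X_cat.shape[1]), dtype=int)
    for _ in range(n_iter):
        means, modes = update_prototypes(X_num, X_cat, states, means, modes)
        new_states = decode_states(mixed_loss(X_num, X_cat, means, modes), lam)
        if np.array_equal(new_states, states):
            break
        states = new_states
    return states, means, modes


def impute(X_num, X_cat, states, means, modes):
    """Fill missing entries with the prototype of the decoded state."""
    X_num_f = np.where(np.isnan(X_num), means[states], X_num)
    X_cat_f = np.where(X_cat == MISSING_CAT, modes[states], X_cat)
    return X_num_f, X_cat_f
```

Larger values of `lam` make switches more expensive and so yield longer regimes; with `lam = 0` the decoding step reduces to assigning each observation to its nearest prototype independently of time. Like other alternating schemes, this one can stop at a local optimum, so in practice several initialisations would usually be compared.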
Taxonomy
Topics: Bayesian Methods and Mixture Models
