Subgroup Identification and Interpretation with Bayesian Nonparametric Models in Health Care Claims Data
Christoph Kurz, Laura Hatfield

TL;DR
This paper introduces a Bayesian nonparametric mixture model for inpatient health care utilization data. Applied to hospital lengths of stay for patients with lung cancer, it identifies patient subgroups that differ in the mean and variance of hospital days.
Contribution
It develops a fully Bayesian nonparametric clustering framework that determines the number of subgroups directly from health care claims data, addressing zero inflation and over-dispersion.
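To make "zero inflation and over-dispersion" concrete, here is a minimal sketch of a zero-inflated negative binomial probability mass function. This is one common way to model such counts; the summary does not state which likelihood the paper uses, so the distribution choice, the function names, and the parameters `pi_zero`, `mean`, and `dispersion` are all illustrative assumptions.

```python
import math


def nb_pmf(y, mean, dispersion):
    """Negative binomial pmf with variance = mean + mean**2 / dispersion.

    Assumed parameterization for illustration; requires mean > 0 and dispersion > 0.
    """
    r = dispersion
    p = r / (r + mean)  # success probability implied by the mean
    log_coef = math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
    return math.exp(log_coef + r * math.log(p) + y * math.log1p(-p))


def zinb_pmf(y, pi_zero, mean, dispersion):
    """Zero-inflated negative binomial pmf.

    With probability pi_zero the count is a structural zero (for example,
    no hospital days); otherwise it is drawn from the negative binomial.
    """
    base = (1.0 - pi_zero) * nb_pmf(y, mean, dispersion)
    return pi_zero + base if y == 0 else base


if __name__ == "__main__":
    # Hypothetical values: 40% structural zeros, mean 5 days, dispersion 2.
    for days in range(4):
        print(days, round(zinb_pmf(days, 0.4, 5.0, 2.0), 4))
```

The extra mass at zero captures patients with no inpatient days, while a small `dispersion` lets the variance exceed the mean, which a Poisson model cannot do.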
Findings
Identified distinct subgroups of lung cancer patients that differ in the mean and variance of hospital days.
Revealed relationships between covariates and length of stay.
Demonstrated improved modeling of complex health care utilization data.
Abstract
Inpatient care is a large share of total health care spending, making analysis of inpatient utilization patterns an important part of understanding what drives health care spending growth. Common features of inpatient utilization measures include zero inflation, over-dispersion, and skewness, all of which complicate statistical modeling. Mixture modeling is a popular approach that can accommodate these features of health care utilization data. In this work, we add a nonparametric clustering component to such models. Our fully Bayesian model framework allows for an unknown number of mixing components, so that the data determine the number of mixture components. When we apply the modeling framework to data on hospital lengths of stay for patients with lung cancer, we find distinct subgroups of patients with differences in means and variances of hospital days, health and treatment…
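The idea that the data determine the number of mixture components can be illustrated with a truncated stick-breaking construction, a standard representation of Dirichlet process mixture weights. The abstract does not name the specific nonparametric prior, so treating it as a Dirichlet process is an assumption here, as are the function names, the concentration parameter `alpha`, and the truncation level. This is a sketch of the prior only, not the authors' inference procedure.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking prior.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick;
    the final component takes whatever is left so the weights sum to 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def occupied_components(weights, n_patients, rng):
    """Assign patients to components and count how many are actually used."""
    labels = rng.choices(range(len(weights)), weights=weights, k=n_patients)
    return len(set(labels))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical settings: up to 30 components available, 500 patients.
    w = stick_breaking_weights(alpha=1.0, truncation=30, rng=rng)
    print("components in use:", occupied_components(w, 500, rng))
```

Although many components are available, most receive negligible weight, so only a few end up occupied. In a full model the posterior over these assignments, rather than a fixed choice by the analyst, indicates how many subgroups the data support.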
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Data-Driven Disease Surveillance
