Clustering blood donors via mixtures of product partition models with covariates
Raffaele Argiento, Riccardo Corradin, Alessandra Guglielmi, Ettore Lanzarone

TL;DR
This paper introduces a Bayesian nonparametric clustering model that incorporates covariate information to improve prediction of blood donation gap times and interpret clusters based on personal characteristics.
Contribution
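It generalizes existing PPMx models through a covariate-dependent prior on the random partition, built from cohesion functions that yield mixtures of PPMx models and similarity functions that measure cluster compactness, improving predictive accuracy and interpretability on blood donation data.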
Findings
Covariate-informed models outperform the baseline in predictive tasks.
Clusters are interpretable through covariate analysis.
The model effectively predicts recurrence times for blood donors.
Abstract
Motivated by the problem of accurately predicting gap times between successive blood donations, we present here a general class of Bayesian nonparametric models for clustering. These models allow for prediction of new recurrences, accommodating covariate information that describes the personal characteristics of the sample individuals. We introduce a prior for the random partition of the sample individuals which encourages two individuals to be co-clustered if they have similar covariate values. Our prior generalizes PPMx models in the literature, which are defined in terms of cohesion and similarity functions. We assume cohesion functions which yield mixtures of PPMx models, while our similarity functions represent the compactness of a cluster. We show that including covariate information in the prior specification improves the posterior predictive performance and helps interpret the…
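To make the cohesion/similarity structure mentioned in the abstract concrete, here is a minimal sketch of how a PPMx-style prior could weight a partition: each cluster contributes a cohesion term times a similarity term, and the unnormalized prior weight is the product over clusters. The specific choices below are assumptions for illustration, not the paper's: `dp_cohesion` is a simple Dirichlet-process-style cohesion (not the mixture construction the authors use), and `compactness_similarity` is a hypothetical compactness measure based on squared deviations from the cluster mean.

```python
import math
from typing import Callable, Sequence


def dp_cohesion(cluster_size: int, mass: float = 1.0) -> float:
    # Assumed cohesion c(A) = mass * (|A| - 1)!, the Dirichlet-process form.
    return mass * math.factorial(cluster_size - 1)


def compactness_similarity(xs: Sequence[float], lam: float = 1.0) -> float:
    # Hypothetical similarity g(x*) = exp(-lam * within-cluster sum of squares).
    # Tighter clusters (smaller spread) get values closer to 1.
    mean = sum(xs) / len(xs)
    spread = sum((x - mean) ** 2 for x in xs)
    return math.exp(-lam * spread)


def ppmx_prior_weight(
    partition: Sequence[Sequence[int]],
    covariates: Sequence[float],
    cohesion: Callable[[int], float] = dp_cohesion,
    similarity: Callable[[Sequence[float]], float] = compactness_similarity,
) -> float:
    # Unnormalized prior weight: product over clusters of c(A_j) * g(x_j*).
    weight = 1.0
    for cluster in partition:
        xs = [covariates[i] for i in cluster]
        weight *= cohesion(len(cluster)) * similarity(xs)
    return weight


# Toy covariate values for four individuals (made up for illustration).
covariates = [0.0, 0.1, 5.0, 5.2]
w_similar = ppmx_prior_weight([[0, 1], [2, 3]], covariates)  # like with like
w_mixed = ppmx_prior_weight([[0, 2], [1, 3]], covariates)    # unlike together
```

In this toy setup both partitions have the same cluster sizes, so the cohesion terms cancel and the comparison is driven entirely by the similarity term: `w_similar` exceeds `w_mixed`, which is the sense in which such a prior encourages individuals with similar covariate values to be co-clustered.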
Taxonomy
Topics: Bayesian Methods and Mixture Models · Blood donation and transfusion practices · Census and Population Estimation
