Topic-informed dynamic mixture model for occupational heterogeneity in health risk behaviors
Lorenzo Schiavon, Mattia Stival, Angela Andreella, Stefano Campostrini

TL;DR
This paper introduces a Bayesian dynamic mixture model that combines textual occupational data with health behavior surveys to analyze how occupational contexts influence health risk behaviors over time.
Contribution
It develops a novel integrated framework using Structural Topic Modeling and Bayesian regression with variable selection to study occupational heterogeneity in health behaviors.
Findings
Occupational groups significantly influence health risk behaviors.
The model captures temporal changes in occupational effects.
The approach improves interpretability and scalability for public health analysis.
Abstract
Behavioral risk factors, i.e., smoking, poor nutrition, alcohol misuse, and physical inactivity (SNAP), are leading contributors to chronic diseases and healthcare costs worldwide. Their prevalence is shaped by demographic characteristics and also by contextual ones such as socioeconomic and occupational environments. In this study, we leverage data from the Italian health and behavioral surveillance system PASSI to model SNAP behaviors through a Bayesian framework that integrates textual information on occupations. We use Structural Topic Modeling (STM) to cluster free-text job descriptions into latent occupational groups, which inform mixture weights in a multivariate ordered probit model. Covariate effects are allowed to vary across occupational clusters and evolve over time. To enhance interpretability and variable selection, we impose non-local spike-and-slab priors…
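To illustrate how topic proportions can act as mixture weights in an ordered probit model, here is a minimal sketch under strong simplifying assumptions: a single ordinal outcome (the paper's model is multivariate), a single time point (no dynamics), fixed coefficients (no priors or inference), and made-up numbers throughout. The function names, `weights`, `betas`, and `cutpoints` are hypothetical and are not taken from the paper.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_probs(eta, cutpoints):
    """Category probabilities of an ordered probit model.

    P(y = k) = Phi(c_k - eta) - Phi(c_{k-1} - eta),
    with c_0 = -inf and c_K = +inf.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [norm_cdf(b - eta) for b in bounds]
    return [cdf[k + 1] - cdf[k] for k in range(len(bounds) - 1)]

def mixture_probs(x, betas, weights, cutpoints):
    """Mix cluster-specific ordered probit probabilities.

    weights: one person's occupational-group proportions (assumed to
             come from a topic model and to sum to 1).
    betas:   one coefficient vector per occupational cluster.
    """
    mixed = [0.0] * (len(cutpoints) + 1)
    for w, beta in zip(weights, betas):
        eta = sum(b * xi for b, xi in zip(beta, x))  # cluster-specific linear predictor
        for k, p in enumerate(ordered_probit_probs(eta, cutpoints)):
            mixed[k] += w * p
    return mixed

# Illustrative inputs only (not estimates from PASSI data).
x = [1.0, 2.0]                       # covariates for one respondent
betas = [[0.5, -0.2], [-0.1, 0.4]]   # two hypothetical occupational clusters
weights = [0.7, 0.3]                 # hypothetical topic proportions
cutpoints = [-0.5, 0.5]              # thresholds for a 3-level ordinal outcome

probs = mixture_probs(x, betas, weights, cutpoints)
print(probs)
```

The returned list gives the probability of each ordinal category for one respondent; because each cluster contributes a valid probability vector and the weights sum to 1, the mixture is itself a valid probability vector.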
Taxonomy
Topics: Mental Health via Writing · Data-Driven Disease Surveillance · Computational and Text Analysis Methods
