Topic-informed dynamic mixture model for occupational heterogeneity in health risk behaviors

Published: December 23, 2025 | arXiv ID: 2512.20408v1

By: Lorenzo Schiavon, Mattia Stival, Angela Andreella and more

Behavioral risk factors, i.e., smoking, poor nutrition, alcohol misuse, and physical inactivity (SNAP), are leading contributors to chronic diseases and healthcare costs worldwide. Their prevalence is shaped by demographic characteristics and also by contextual ones, such as socioeconomic and occupational environments. In this study, we leverage data from the Italian health and behavioral surveillance system PASSI to model SNAP behaviors through a Bayesian framework that integrates textual information on occupations. We use Structural Topic Modeling (STM) to cluster free-text job descriptions into latent occupational groups, which inform mixture weights in a multivariate ordered probit model. Covariate effects are allowed to vary across occupational clusters and evolve over time. To enhance interpretability and variable selection, we impose non-local spike-and-slab priors on regression coefficients. Finally, an online learning algorithm based on sequential Monte Carlo enables efficient updating as new data become available. This dynamic, scalable, and interpretable approach makes it possible to observe how occupational contexts modulate the impact of socio-demographic factors on health behaviors, providing valuable insights for targeted public health interventions.
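
To make the idea of topic-informed mixture weights concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single ordinal outcome, a static model, and fixed parameters, whereas the abstract describes a multivariate, time-varying model with spike-and-slab priors. All names and numbers (`mixture_probs`, `topic_weights`, `betas`, `cutpoints`, and the example inputs) are hypothetical and chosen only for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probs(eta, cutpoints):
    """Category probabilities of an ordered probit with linear predictor eta.

    cutpoints must be increasing; K cutpoints define K + 1 ordered categories.
    """
    cdf = [0.0] + [norm_cdf(c - eta) for c in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cutpoints) + 1)]


def mixture_probs(x, topic_weights, betas, cutpoints):
    """Mix cluster-specific ordered probit probabilities with topic weights.

    topic_weights: topic proportions for one respondent (non-negative, sum to 1),
                   assumed to come from a topic model fitted to job descriptions.
    betas:         one coefficient vector per latent occupational cluster.
    """
    probs = [0.0] * (len(cutpoints) + 1)
    for w, beta in zip(topic_weights, betas):
        # Cluster-specific linear predictor for this respondent.
        eta = sum(b * xi for b, xi in zip(beta, x))
        for k, p in enumerate(ordered_probit_probs(eta, cutpoints)):
            probs[k] += w * p
    return probs


# Hypothetical inputs, for illustration only (not taken from PASSI data).
x = [1.0, 0.5]                       # intercept and one covariate
topic_weights = [0.7, 0.3]           # two latent occupational groups
betas = [[0.2, -0.4], [-0.1, 0.8]]   # cluster-specific covariate effects
cutpoints = [-0.5, 0.5]              # three ordered response categories

probs = mixture_probs(x, topic_weights, betas, cutpoints)
```

Because each cluster-specific probability vector sums to one and the topic weights sum to one, the mixed probabilities also form a valid distribution over the ordered categories. A respondent whose job description loads mostly on one topic is governed mainly by that cluster's covariate effects.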

Category
Statistics:
Applications