Constructing synthetic populations in the age of big data
M. A. Nicolaie, Koen Fussenich, Caroline Ameling, Hendriek C. Boshuizen

TL;DR
This paper presents a method for creating synthetic populations by predicting individual features from confidential health and socio-economic data, enabling realistic public health simulations while preserving data privacy.
Contribution
It introduces a novel approach to generate synthetic data for micro-simulations using models trained on confidential population data, addressing privacy concerns.
Findings
Effective synthetic population generation demonstrated
Models accurately predict individual features
Supports realistic public health scenario analysis
Abstract
To develop public health intervention models using microsimulations, extensive personal information about inhabitants is needed, such as socio-demographic, economic and health figures. Such data must remain confidential, yet they should also support realistic scenarios. They can be collected only in secured environments and are not directly available to external microsimulation models. The aim of this paper is to illustrate a method for constructing synthetic data by predicting individual features through models based on confidential data on health and socio-economic determinants of the entire Dutch population.
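The general idea in the abstract (fit a model on confidential records, then use only the fitted model to predict features for synthetic individuals) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which models are used, so a plain conditional frequency table stands in for them here. The function names, the variables `age_group`, `sex` and `smoker`, and the toy records are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_conditional_model(records, predictors, target):
    """Estimate P(target | predictors) as relative frequencies.

    Assumption: this step would run inside the secured environment,
    and only the returned table of probabilities would leave it.
    """
    counts = defaultdict(Counter)
    for rec in records:
        key = tuple(rec[p] for p in predictors)
        counts[key][rec[target]] += 1
    model = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        model[key] = {value: n / total for value, n in counter.items()}
    return model


def synthesize(model, predictors, target, base_population, rng):
    """Draw the target feature for each synthetic person from the model."""
    synthetic = []
    for person in base_population:
        key = tuple(person[p] for p in predictors)
        dist = model[key]
        values = list(dist)
        weights = [dist[v] for v in values]
        drawn = rng.choices(values, weights=weights, k=1)[0]
        synthetic.append({**person, target: drawn})
    return synthetic


# Toy stand-in for confidential microdata (invented, not from the paper).
confidential = [
    {"age_group": "18-44", "sex": "F", "smoker": "no"},
    {"age_group": "18-44", "sex": "F", "smoker": "no"},
    {"age_group": "18-44", "sex": "M", "smoker": "yes"},
    {"age_group": "18-44", "sex": "M", "smoker": "no"},
    {"age_group": "45+", "sex": "F", "smoker": "yes"},
    {"age_group": "45+", "sex": "M", "smoker": "yes"},
]
predictors = ["age_group", "sex"]
model = fit_conditional_model(confidential, predictors, "smoker")

# Synthetic individuals that so far carry only the predictor variables.
base_population = [
    {"age_group": "18-44", "sex": "F"},
    {"age_group": "45+", "sex": "M"},
    {"age_group": "18-44", "sex": "M"},
]
synthetic = synthesize(model, predictors, "smoker", base_population, random.Random(0))
```

In this sketch no original record is copied into the synthetic population; only the estimated probabilities are used. Frequencies estimated from very small cells can still be disclosive, so a real application would need disclosure control on the exported model, which the sketch omits.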
Taxonomy
Topics: Demographic modeling and climate adaptation · Health disparities and outcomes · Chronic Disease Management Strategies
