A Deep Generative Framework for Joint Households and Individuals Population Synthesis
Xiao Qian, Utkarsh Gangwal, Shangjia Dong, Rachel Davidson

TL;DR
This paper introduces a deep generative VAE-based framework for creating realistic, geographically accurate synthetic populations that preserve complex sociodemographic relationships and align with census data.
Contribution
It presents a novel data structure, transfer learning process, and loss function to generate synthetic populations with preserved correlations and geographic distribution.
Findings
Successfully generated realistic household and individual records in Delaware.
Accurately matched census tract population statistics.
Demonstrated transferability in North Carolina.
Abstract
Household and individual-level sociodemographic data are essential for understanding human-infrastructure interaction and policymaking. However, the Public Use Microdata Sample (PUMS) offers only a sample at the state level, while census tract data only provides the marginal distributions of variables without correlations. Therefore, we need an accurate synthetic population dataset that maintains consistent variable correlations observed in microdata, preserves household-individual and individual-individual relationships, adheres to state-level statistics, and accurately represents the geographic distribution of the population. We propose a deep generative framework leveraging the variational autoencoder (VAE) to generate a synthetic population with the aforementioned features. The methodological contributions include (1) a new data structure for capturing household-individual and…
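The abstract's point that tract-level marginals carry no correlation information can be illustrated with a toy example. The sketch below is not the paper's method; the two populations and their variable names (`age_group`, `employment`) are hypothetical and chosen only to show that identical marginal counts can hide different joint distributions, which is why microdata such as PUMS is needed alongside tract tables.

```python
from collections import Counter

# Two hypothetical toy populations of (age_group, employment) records.
pop_a = [("adult", "employed"), ("adult", "employed"),
         ("senior", "retired"), ("senior", "retired")]
pop_b = [("adult", "employed"), ("adult", "retired"),
         ("senior", "employed"), ("senior", "retired")]

def marginals(records):
    """Count each variable separately, as a tract-level table would."""
    return [Counter(column) for column in zip(*records)]

def joint(records):
    """Count whole records, which requires record-level microdata."""
    return Counter(records)

# Both populations have 2 adults, 2 seniors, 2 employed, 2 retired ...
same_marginals = marginals(pop_a) == marginals(pop_b)
# ... but the age-employment pairing differs between them.
same_joint = joint(pop_a) == joint(pop_b)

print(same_marginals, same_joint)
```

Here the marginal tables match while the joint counts do not, so a synthesizer constrained only by marginals could not tell the two populations apart.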
Taxonomy
Topics: Insurance, Mortality, Demography, Risk Management
Methods: ALIGN
