Scalable Maximum Entropy Population Synthesis via Persistent Contrastive Divergence
Mirko Degli Esposti

TL;DR
This paper introduces GibbsPCDSolver, a scalable stochastic method for maximum entropy population synthesis that efficiently handles high-dimensional data without explicit enumeration, enabling more diverse synthetic populations.
Contribution
The paper presents GibbsPCDSolver, a novel PCD-based approach that scales maximum entropy population synthesis to high-dimensional data with improved diversity and efficiency.
Findings
Maintains low mean relative error across increasing attribute dimensions.
Achieves 86.8 times greater diversity than generalized raking.
Runtime scales linearly with the number of attributes, not the full tuple space.
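The scaling claim in the last finding can be made concrete with a rough operation count. The sketch below is illustrative only: the function names, the choice of ten attributes with five categories each, and the pool size of 1000 are assumptions, not figures from the paper. It compares the number of tuples an exact-enumeration solver would have to sum over with the number of conditional-probability evaluations in one Gibbs sweep over a persistent pool.

```python
import math

def tuple_space_size(sizes):
    """Number of distinct attribute combinations (what exact enumeration visits)."""
    return math.prod(sizes)

def gibbs_sweep_cost(sizes, pool_size):
    """Rough count of conditional evaluations in one sweep:
    every pool member resamples every attribute once, scoring each category."""
    return pool_size * sum(sizes)

# Assumed example: 10 attributes, 5 categories each, pool of 1000 individuals.
sizes = [5] * 10
enumeration = tuple_space_size(sizes)
sweep = gibbs_sweep_cost(sizes, pool_size=1000)
print(enumeration, sweep)
```

Adding an attribute multiplies the first quantity by its category count but only adds a term to the second, which is the sense in which the sweep cost grows linearly with the number of attributes.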
Abstract
Maximum entropy (MaxEnt) modelling provides a principled framework for generating synthetic populations from aggregate census data, without access to individual-level microdata. The bottleneck of exact-enumeration approaches is expectation computation by explicit summation over the full tuple space, which becomes infeasible for more than categorical attributes; sampling-based alternatives exist but rely on Metropolis-type schemes that require proposal tuning and rejection steps. We propose GibbsPCDSolver, a stochastic replacement for this computation based on Persistent Contrastive Divergence (PCD): a persistent pool of synthetic individuals is updated by Gibbs sweeps at each gradient step, providing a stochastic approximation of the model expectations without ever materialising the tuple space. We validate the approach on controlled benchmarks and on…
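The PCD loop described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' GibbsPCDSolver: the function names (`pool_marginal`, `gibbs_sweep`, `pcd_fit`), the constraint format (a tuple of attribute indices paired with a target frequency table), the pool size, step count, learning rate, and the toy targets are all hypothetical. The model assumed here is log-linear, with one parameter per cell of each constrained marginal table.

```python
import numpy as np

def pool_marginal(pool, attrs, shape):
    """Empirical frequency table of the pool over the given attributes."""
    table = np.zeros(shape)
    np.add.at(table, tuple(pool[:, a] for a in attrs), 1.0)
    return table / len(pool)

def gibbs_sweep(pool, thetas, constraints, sizes, rng):
    """Resample each attribute of every pool member from its conditional."""
    n = len(pool)
    for k, size in enumerate(sizes):
        logits = np.zeros((n, size))
        for (attrs, _), theta in zip(constraints, thetas):
            if k not in attrs:
                continue
            for v in range(size):
                # Score candidate value v for attribute k, others held fixed.
                idx = tuple(np.full(n, v) if a == k else pool[:, a] for a in attrs)
                logits[:, v] += theta[idx]
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        cdf = np.cumsum(probs / probs.sum(axis=1, keepdims=True), axis=1)
        draws = (cdf < rng.random((n, 1))).sum(axis=1)  # inverse-CDF sampling
        pool[:, k] = np.minimum(draws, size - 1)        # guard against round-off
    return pool

def pcd_fit(constraints, sizes, pool_size=2000, steps=300, lr=0.5, seed=0):
    """Stochastic gradient ascent on the MaxEnt dual with a persistent pool."""
    rng = np.random.default_rng(seed)
    pool = np.column_stack([rng.integers(0, s, pool_size) for s in sizes])
    thetas = [np.zeros_like(t, dtype=float) for _, t in constraints]
    for _ in range(steps):
        # The pool persists across steps: one sweep, never a restart.
        pool = gibbs_sweep(pool, thetas, constraints, sizes, rng)
        for i, (attrs, target) in enumerate(constraints):
            # Gradient = target marginal minus pool estimate of model marginal.
            thetas[i] += lr * (target - pool_marginal(pool, attrs, target.shape))
    return thetas, pool

# Assumed toy targets: one univariate and one pairwise marginal.
sizes = [2, 3, 2]
constraints = [
    ((0,), np.array([0.7, 0.3])),
    ((1, 2), np.array([[0.3, 0.1], [0.1, 0.2], [0.2, 0.1]])),
]
thetas, pool = pcd_fit(constraints, sizes)
```

The returned pool doubles as the synthetic population. Its marginals can be compared against the targets with `pool_marginal`, and at no point is a table over all attribute combinations constructed.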
