Generative AI-Based Monte Carlo Simulation for Method Evaluation Using Synthetic Multilevel Data
Youmi Suk, Chenguang Pan, Weixuan Xiao

TL;DR
This paper introduces a framework for generating synthetic multilevel data with Generative AI and using that data in Monte Carlo simulations to evaluate quantitative methods, with an emphasis on data fidelity and robustness checks.
Contribution
It presents a six-stage workflow for AI-based simulation studies, including data generation, quality assessment, and tailored evaluation strategies for multilevel data.
Findings
Enhanced data fidelity through targeted modifications to diffusion models and GANs.
Systematic quality evaluation framework for within-table and between-table fidelity.
Utility demonstrated on empirical multilevel data for evaluating method performance.
Abstract
The role of AI-generated synthetic data has recently been expanded to support realistic Monte Carlo simulations. However, guidance is limited on generating data with multilevel structures and designing simulations based on such data. This study proposes a general framework for AI-based simulation studies to evaluate the predictive performance and parameter recovery of quantitative methods, specifically using multilevel data commonly observed in the social sciences. Our proposed six-stage workflow consists of (i) specifying a method and real data, (ii) training Generative AI with real data, (iii) assessing synthetic data quality, (iv) designing and conducting simulations, (v) evaluating method performance, and (vi) checking robustness. To enhance fidelity in multilevel data generation, we also introduce targeted modifications to diffusion models and Generative Adversarial Networks…
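The six stages listed in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the "real" data are simulated students-in-schools scores, the generator is a simple cluster bootstrap standing in for a trained diffusion model or GAN, the quality check compares a single statistic (the intraclass correlation), and the method under evaluation is the grand mean. All function names (`make_real_data`, `train_generator`, `icc`, `run_workflow`) are hypothetical.

```python
import random
import statistics


def make_real_data(n_clusters=30, n_per=20, seed=0):
    """Hypothetical stand-in for real multilevel data (students in schools)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        u = rng.gauss(0, 5)  # cluster-level random intercept
        data.append([50 + u + rng.gauss(0, 10) for _ in range(n_per)])
    return data


def train_generator(real):
    """Stage (ii) placeholder: a cluster bootstrap, NOT a diffusion model or GAN."""
    def sample(rng):
        clusters = [rng.choice(real) for _ in real]            # resample clusters
        return [[rng.choice(cl) for _ in cl] for cl in clusters]  # then units
    return sample


def icc(data):
    """One-way ANOVA estimate of the intraclass correlation (balanced clusters)."""
    n, k = len(data), len(data[0])
    means = [statistics.fmean(cl) for cl in data]
    grand = statistics.fmean(means)
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((y - m) ** 2 for cl, m in zip(data, means) for y in cl) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def grand_mean(data):
    """Stage (i): the (deliberately simple) method under evaluation."""
    return statistics.fmean(y for cl in data for y in cl)


def run_workflow(n_reps=200, seed=1):
    real = make_real_data()                       # (i) method and real data
    generate = train_generator(real)              # (ii) "train" the generator
    rng = random.Random(seed)
    icc_gap = abs(icc(real) - icc(generate(rng)))  # (iii) one fidelity check
    # Assumption: the real-data value serves as the reference "truth".
    truth = grand_mean(real)
    estimates = [grand_mean(generate(rng)) for _ in range(n_reps)]  # (iv) simulate
    bias = statistics.fmean(estimates) - truth    # (v) parameter recovery
    rmse = statistics.fmean((e - truth) ** 2 for e in estimates) ** 0.5
    return {"icc_gap": icc_gap, "bias": bias, "rmse": rmse}


result = run_workflow(seed=1)
rerun = run_workflow(seed=2)  # (vi) crude robustness check: repeat with a new seed
print(result)
print(rerun)
```

In a real study the bootstrap would be replaced by the trained generative model, and the single ICC comparison by a fuller set of fidelity checks. The loop over replicates and the bias and RMSE summaries would keep the same shape.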
