Generative AI-Based Monte Carlo Simulation for Method Evaluation Using Synthetic Multilevel Data

Youmi Suk; Chenguang Pan; Weixuan Xiao

arXiv:2605.05752 · stat.ME · May 8, 2026

TL;DR

This paper proposes a generative AI-based framework for producing synthetic multilevel data and using it in Monte Carlo simulations that evaluate quantitative methods, with attention to synthetic data fidelity and robustness checks.

Contribution

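The paper presents a six-stage workflow for AI-based simulation studies: specifying a method and real data, training a generative model, assessing synthetic data quality, designing and conducting simulations, evaluating method performance, and checking robustness. It also introduces targeted modifications to diffusion models and GANs for multilevel data.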

Findings

01. Enhanced data fidelity through targeted modifications to diffusion models and GANs.

02. Systematic quality evaluation framework for within-table and between-table fidelity.

03. Demonstrated utility with empirical multilevel data, improving method evaluation accuracy.

Abstract

The role of AI-generated synthetic data has recently been expanded to support realistic Monte Carlo simulations. However, guidance is limited on generating data with multilevel structures and designing simulations based on such data. This study proposes a general framework for AI-based simulation studies to evaluate the predictive performance and parameter recovery of quantitative methods, specifically using multilevel data commonly observed in the social sciences. Our proposed six-stage workflow consists of (i) specifying a method and real data, (ii) training Generative AI with real data, (iii) assessing synthetic data quality, (iv) designing and conducting simulations, (v) evaluating method performance, and (vi) checking robustness. To enhance fidelity in multilevel data generation, we also introduce targeted modifications to diffusion models and Generative Adversarial Networks…
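The six stages in the abstract can be sketched end to end in a few lines. The example below is a minimal illustration under loud assumptions, not the authors' implementation: the "real" data are a toy two-level dataset, the generator is a simple fitted random-intercept model standing in for a diffusion model or GAN, the method under evaluation is a pooled OLS slope, and the quality check compares only the intraclass correlation (ICC). All function and class names (`make_toy_multilevel`, `StandInGenerator`, `run_simulation`) are hypothetical, and the abstract does not say how the target parameter value is defined; here the generator's own slope is used.

```python
import numpy as np

def make_toy_multilevel(rng, n_clusters=50, cluster_size=20, slope=0.5, tau=1.0, sigma=1.0):
    """Toy placeholder for real two-level data: y = slope * x + u_j + e_ij."""
    cluster = np.repeat(np.arange(n_clusters), cluster_size)
    x = rng.normal(size=cluster.size)
    u = rng.normal(scale=tau, size=n_clusters)[cluster]
    y = slope * x + u + rng.normal(scale=sigma, size=cluster.size)
    return cluster, x, y

def ols_slope(x, y):
    """Stage (i): the method under evaluation, here a pooled OLS slope."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def icc(cluster, y):
    """One-way ANOVA estimate of the intraclass correlation (balanced clusters)."""
    ids, counts = np.unique(cluster, return_counts=True)
    n = counts[0]
    means = np.array([y[cluster == j].mean() for j in ids])
    msb = n * means.var(ddof=1)
    msw = sum(((y[cluster == j] - m) ** 2).sum() for j, m in zip(ids, means)) / (y.size - ids.size)
    return float((msb - msw) / (msb + (n - 1) * msw))

class StandInGenerator:
    """Stage (ii) stand-in: a fitted random-intercept model, NOT a diffusion model or GAN."""

    def fit(self, cluster, x, y):
        ids, counts = np.unique(cluster, return_counts=True)
        self.n_clusters, self.cluster_size = ids.size, int(counts[0])
        self.slope = ols_slope(x, y)
        self.intercept = float(y.mean() - self.slope * x.mean())
        resid = y - self.intercept - self.slope * x
        means = np.array([resid[cluster == j].mean() for j in ids])
        within = sum(((resid[cluster == j] - m) ** 2).sum() for j, m in zip(ids, means))
        self.sigma2 = float(within / (resid.size - ids.size))
        # Method-of-moments between-cluster variance, truncated at zero.
        self.tau2 = max(float(means.var(ddof=1) - self.sigma2 / self.cluster_size), 0.0)
        return self

    def sample(self, rng):
        # Assumes x is standard normal, as in the toy data above.
        cluster = np.repeat(np.arange(self.n_clusters), self.cluster_size)
        x = rng.normal(size=cluster.size)
        u = rng.normal(scale=np.sqrt(self.tau2), size=self.n_clusters)[cluster]
        e = rng.normal(scale=np.sqrt(self.sigma2), size=cluster.size)
        return cluster, x, self.intercept + self.slope * x + u + e

def run_simulation(gen, n_reps, seed):
    """Stages (iv)-(v): apply the method to repeated synthetic draws; report bias and RMSE."""
    rng = np.random.default_rng(seed)
    est = np.array([ols_slope(*gen.sample(rng)[1:]) for _ in range(n_reps)])
    truth = gen.slope  # assumption: the generator's slope serves as the known target
    return {"bias": float(est.mean() - truth),
            "rmse": float(np.sqrt(((est - truth) ** 2).mean()))}

rng = np.random.default_rng(0)
real = make_toy_multilevel(rng)                      # stage (i): toy stand-in for real data
gen = StandInGenerator().fit(*real)                  # stage (ii): train the generator
synth = gen.sample(rng)
icc_gap = abs(icc(real[0], real[2]) - icc(synth[0], synth[2]))  # stage (iii): one fidelity check
main = run_simulation(gen, n_reps=200, seed=1)       # stages (iv)-(v)
check = run_simulation(gen, n_reps=200, seed=2)      # stage (vi): rerun under a different seed
print(icc_gap, main, check)
```

In a real study the generator would be replaced by a trained generative model, the single ICC comparison by a fuller set of fidelity checks, and the seed-based rerun by whatever robustness checks the design calls for.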
