New Money: A Systematic Review of Synthetic Data Generation for Finance
James Meldrum, Basem Suleiman, Fethi Rabhi, Muhammad Johan Alibasa

TL;DR
This systematic review analyzes 72 studies on synthetic financial data generation. It highlights the dominance of GAN-based methods and applications to market and credit data, and it identifies gaps in privacy evaluation to guide future research.
Contribution
The review provides a comprehensive synthesis of research published since 2018 on generative models for financial data. It categorizes methods, applications, and evaluation strategies, and identifies key research gaps.
Findings
GANs dominate the literature for financial data synthesis
Most studies focus on time-series and tabular data
There is a lack of rigorous privacy evaluation
Abstract
Synthetic data generation has emerged as a promising approach to address the challenges of using sensitive financial data in machine learning applications. By leveraging generative models, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), it is possible to create artificial datasets that preserve the statistical properties of real financial records while mitigating privacy risks and regulatory constraints. Despite the rapid growth of this field, a comprehensive synthesis of the current research landscape has been lacking. This systematic review consolidates and analyses 72 studies published since 2018 that focus on synthetic financial data generation. We categorise the types of financial information synthesised, the generative methods employed, and the evaluation strategies used to assess data utility and privacy. The findings indicate that GAN-based…
