FinDiff: Diffusion Models for Financial Tabular Data Generation
Timur Sattarov, Marco Schreyer, Damian Borth

TL;DR
FinDiff is a diffusion model that synthesizes high-fidelity, privacy-preserving financial tabular data for regulatory and research applications, outperforming baseline models in data fidelity.
Contribution
This paper introduces FinDiff, a novel diffusion-based generative model specifically designed for complex financial tabular data with mixed modalities.
Findings
FinDiff outperforms baseline models in data fidelity.
The model maintains high privacy and utility levels.
The model is effective across multiple real-world financial datasets.
Abstract
The sharing of microdata, such as fund holdings and derivative instruments, by regulatory institutions presents a unique challenge due to strict data confidentiality and privacy regulations. These challenges often hinder the ability of both academics and practitioners to conduct collaborative research effectively. The emergence of generative models, particularly diffusion models, capable of synthesizing data mimicking the underlying distributions of real-world data presents a compelling solution. This work introduces 'FinDiff', a diffusion model designed to generate real-world financial tabular data for a variety of regulatory downstream tasks, for example economic scenario modeling, stress tests, and fraud detection. The model uses embedding encodings to model mixed modality financial data, comprising both categorical and numeric attributes. The performance of FinDiff in generating…
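The abstract mentions embedding encodings for mixed categorical and numeric attributes, followed by a diffusion process. The sketch below illustrates that general idea only and is not the authors' implementation. Everything concrete in it is an assumption made for illustration: the column names (`instrument_type`, `notional`), the random (untrained) embedding table, the standardization of numeric values, the linear noise schedule, and the standard Gaussian forward process x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps commonly used in denoising diffusion models.

```python
import math
import random

def make_embedding_table(categories, dim, rng):
    """Give each category a dim-dimensional vector (random here; learned in a real model)."""
    return {c: [rng.gauss(0.0, 1.0) for _ in range(dim)] for c in categories}

def encode_row(row, cat_tables, num_stats):
    """Concatenate categorical embeddings with standardized numeric attributes."""
    vec = []
    for name, table in cat_tables.items():
        vec.extend(table[row[name]])
    for name, (mean, std) in num_stats.items():
        vec.append((row[name] - mean) / std)
    return vec

def alpha_bar(t, betas):
    """Cumulative product of (1 - beta_s) over the first t steps."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod

def forward_noise(x0, t, betas, rng):
    """Sample x_t from N(sqrt(a_bar_t) * x0, (1 - a_bar_t) * I)."""
    a_bar = alpha_bar(t, betas)
    return [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for x in x0
    ]

# Hypothetical toy schema: one categorical and one numeric attribute.
rng = random.Random(0)
cat_tables = {
    "instrument_type": make_embedding_table(["bond", "equity", "swap"], 2, rng)
}
num_stats = {"notional": (1000.0, 250.0)}  # assumed (mean, std) of the column
row = {"instrument_type": "swap", "notional": 1500.0}

T = 100
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]  # assumed linear schedule

x0 = encode_row(row, cat_tables, num_stats)   # continuous representation of the row
xT = forward_noise(x0, T, betas, rng)         # heavily noised version of the same row
```

In a trained model of this kind, a neural network would learn to reverse the noising step, and generated vectors would be mapped back to categories (for example by nearest embedding) and to numeric values by undoing the standardization. Those steps are omitted here.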
Taxonomy
Topics: FinTech, Crowdfunding, Digital Finance · Stock Market Forecasting Methods
Methods: Diffusion
