Synthetic Data for Portfolios: A Throw of the Dice Will Never Abolish Chance
Adil Rengim Cetingoz, Charles-Albert Lehalle

TL;DR
This paper examines the limitations of generative models in financial portfolio management. It emphasizes theoretical constraints and potential pitfalls, and proposes a new generation pipeline with improved evaluation methods.
Contribution
It provides theoretical insights on sample size effects, highlights inherent issues in generative models for finance, and introduces a new pipeline with enhanced evaluation for portfolio return simulation.
Findings
Theoretical results on initial sample size importance
Identification of pitfalls in generating far more data than originally available
Proposed pipeline for realistic return generation
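The sample-size point can be illustrated with a toy simulation. This is a minimal sketch and not the paper's theoretical result or pipeline: it assumes i.i.d. Gaussian daily returns and uses a fitted Gaussian as a stand-in for a generative model. The function name `synthetic_mean_error` and all parameter values are hypothetical. The idea is that a generator fitted on `n_real` observations inherits their estimation error, so drawing many more synthetic points converges to the fitted mean rather than the true one.

```python
import numpy as np

def synthetic_mean_error(n_real=250, n_synth=1_000_000,
                         mu=0.0005, sigma=0.01, seed=0):
    """Compare mean-estimation error of a small real sample with that of a
    much larger synthetic sample drawn from a model fitted on the real one.
    All defaults are illustrative assumptions, not values from the paper."""
    rng = np.random.default_rng(seed)

    # "Real" data: a small sample from the (normally unknown) true distribution.
    real = rng.normal(mu, sigma, size=n_real)

    # Stand-in generative model: a Gaussian fitted to the real sample.
    mu_hat = real.mean()
    sigma_hat = real.std(ddof=1)

    # Generate far more synthetic data than was originally available.
    synth = rng.normal(mu_hat, sigma_hat, size=n_synth)

    return {
        "real_error": abs(mu_hat - mu),             # error of the original estimate
        "synth_error": abs(synth.mean() - mu),      # error after "augmentation"
        "synth_vs_fit": abs(synth.mean() - mu_hat), # distance to the fitted mean
    }

result = synthetic_mean_error()
for name, value in result.items():
    print(f"{name}: {value:.6f}")
```

Under these assumptions, `synth_vs_fit` shrinks as `n_synth` grows, while `synth_error` stays close to `real_error`: the extra synthetic data sharpens the estimate of the fitted model, not of the market.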
Abstract
Simulation methods have always been instrumental in finance, and data-driven methods with minimal model specification, commonly referred to as generative models, have attracted increasing attention, especially after the success of deep learning in a broad range of fields. However, the adoption of these models in financial applications has not matched the growing interest, probably due to the unique complexities and challenges of financial markets. This paper contributes to a deeper understanding of the limitations of generative models, particularly in portfolio and risk management. To this end, we begin by presenting theoretical results on the importance of initial sample size, and point out the potential pitfalls of generating far more data than originally available. We then highlight the inseparable nature of model development and the desired uses by touching on a paradox: usual…
Taxonomy
Topics: Reflective Practices in Education · Scientific Computing and Data Management
