Generating Heterogeneous Multi-dimensional Data: A Comparative Study
Michael Corbeau, Emmanuelle Claeys, Mathieu Serrurier, Pascale Zaraté

TL;DR
This paper compares various advanced data generation methods to produce realistic synthetic multi-dimensional data for firefighter intervention scenarios, addressing the challenge of unbalanced, non-Gaussian distributions.
Contribution
It evaluates the effectiveness of multiple generative models using domain-specific metrics tailored for firefighting data, highlighting their strengths and limitations.
Findings
Diffusion models better capture complex correlations.
GANs and VAEs show varying success depending on data features.
Evaluation metrics reveal the importance of domain-specific assessment.
Abstract
Allocation of personnel and material resources is highly sensitive in the case of firefighter interventions. This allocation relies on simulations to experiment with various scenarios. The main objective of this allocation is the global optimization of the firefighters' response. Data generation is therefore required to study various scenarios. In this study, we propose to compare different data generation methods. Methods such as Random Sampling, Tabular Variational Autoencoders, standard Generative Adversarial Networks, Conditional Tabular Generative Adversarial Networks and Diffusion Probabilistic Models are examined to ascertain their efficacy in capturing the intricacies of firefighter interventions. Traditional evaluation metrics often fall short in capturing the nuanced requirements of synthetic datasets for real-world scenarios. To address this gap, an evaluation of synthetic data…
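To illustrate why capturing correlations matters when comparing generators, here is a minimal sketch of a random-sampling baseline. It assumes "Random Sampling" means resampling each column independently from its observed values; the paper's exact definition is not given in this summary. The toy data, the column meanings (crew size, duration) and the helper names are hypothetical and are not taken from the paper.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample_columns_independently(rows, n_samples, rng):
    """Assumed baseline: draw each column separately, with replacement."""
    columns = list(zip(*rows))
    synthetic_columns = [rng.choices(col, k=n_samples) for col in columns]
    return [tuple(values) for values in zip(*synthetic_columns)]


rng = random.Random(0)

# Hypothetical "real" data: duration grows with crew size, plus noise.
real = []
for _ in range(2000):
    crew = rng.randint(2, 12)
    duration = 10.0 * crew + rng.gauss(0.0, 5.0)
    real.append((crew, duration))

synthetic = sample_columns_independently(real, 2000, rng)

real_corr = pearson(*zip(*real))
synth_corr = pearson(*zip(*synthetic))

print(f"real correlation:      {real_corr:.3f}")
print(f"synthetic correlation: {synth_corr:.3f}")
```

Under these assumptions, each synthetic column keeps the values seen in the real column, but the link between crew size and duration is lost. This is the kind of gap that a correlation-aware evaluation is meant to expose.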
