Towards Synthetic Multivariate Time Series Generation for Flare Forecasting
Yang Chen, Dustin J. Kempton, Azim Ahmadzadeh, Rafal A. Angryk

TL;DR
This paper demonstrates that using a conditional GAN to generate synthetic multivariate time series can effectively augment data for flare forecasting, significantly improving prediction metrics and addressing data imbalance challenges.
Contribution
It introduces a CGAN-based method for synthetic data generation in multivariate time series, enhancing flare forecasting accuracy and overcoming data scarcity issues.
Findings
Synthetic data improves classifier performance
20-fold increase in the True Skill Statistic (TSS)
5-fold increase in the Heidke Skill Score (HSS)
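For readers unfamiliar with the two metrics, the sketch below shows how TSS (True Skill Statistic) and HSS (Heidke Skill Score) are commonly computed from a binary confusion matrix. It assumes the widely used HSS2 form of the Heidke score; the summary above does not say which variant the paper reports. The function names and the example counts are hypothetical and are not results from the paper.

```python
def tss(tp, fn, fp, tn):
    """True Skill Statistic: recall minus false-alarm rate, in [-1, 1]."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp, fn, fp, tn):
    """Heidke Skill Score (HSS2 form, an assumption): skill relative to random chance."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Hypothetical counts for an imbalanced test set (50 flaring, 950 non-flaring samples).
tp, fn, fp, tn = 30, 20, 50, 900
print(f"TSS = {tss(tp, fn, fp, tn):.3f}")
print(f"HSS = {hss(tp, fn, fp, tn):.3f}")
```

TSS does not depend on the class-imbalance ratio, while HSS does, which is one reason the two are often reported together for rare-event forecasts.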
Abstract
One of the limiting factors in training data-driven, rare-event prediction algorithms is the scarcity of the events of interest, resulting in an extreme imbalance in the data. Many methods have been introduced in the literature for overcoming this issue: simple data manipulation through undersampling and oversampling, cost-sensitive learning algorithms, and the generation of synthetic data points following the distribution of the existing data. While synthetic data generation has recently received a great deal of attention, there are real challenges involved in doing so for high-dimensional data such as multivariate time series. In this study, we explore the usefulness of the conditional generative adversarial network (CGAN) as a means to perform data-informed oversampling in order to balance a large dataset of multivariate time series. We utilize a flare forecasting benchmark…
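As a point of comparison for the simple oversampling baseline mentioned in the abstract, here is a minimal sketch of random oversampling for multivariate time series. This is not the paper's CGAN method: it only duplicates existing minority samples, whereas a CGAN generates new ones. The function name `random_oversample`, the array shapes, and the class counts are all assumptions chosen for illustration.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Duplicate samples of the smaller classes (with replacement) until all classes match the largest.

    X: array of shape (n_samples, n_timesteps, n_features)
    y: integer class labels of shape (n_samples,)
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    index_parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing number of samples for this class from its own members.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        index_parts.append(np.concatenate([cls_idx, extra]))
    idx = np.concatenate(index_parts)
    return X[idx], y[idx]


# Hypothetical imbalanced dataset: 95 majority and 5 minority time series.
X = np.random.default_rng(1).normal(size=(100, 6, 3))
y = np.array([0] * 95 + [1] * 5)
X_bal, y_bal = random_oversample(X, y)
print(X_bal.shape, np.bincount(y_bal))
```

Because every added sample is an exact copy, this baseline adds no new variation to the minority class, which is the limitation that data-informed generation aims to address.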
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Time Series Analysis and Forecasting · Market Dynamics and Volatility
