SeismoFlow -- Data augmentation for the class imbalance problem
Ruy Luiz Milidiú, Luis Felipe Müller

TL;DR
SeismoFlow is a flow-based generative model that creates synthetic seismogram samples to address class imbalance in seismic signal quality classification, improving rare class detection without harming overall accuracy.
Contribution
The paper introduces SeismoFlow, a novel flow-based generative approach inspired by Glow, for generating synthetic samples to mitigate class imbalance in seismic data classification.
Findings
Achieved 13.9% improvement in rare class F1-score.
Generated high-quality, realistic synthetic seismograms.
Enhanced overall accuracy without degrading other class metrics.
Abstract
In several application areas, such as medical diagnosis, spam filtering, fraud detection, and seismic data analysis, it is common to find relevant classification tasks in which some classes occur rarely. This is the so-called class imbalance problem, a well-known challenge in machine learning. In this work, we propose SeismoFlow, a flow-based generative model that creates synthetic samples to address class imbalance. Inspired by the Glow model, it uses interpolation in the learned latent space to produce synthetic samples for one rare class. We apply our approach to the development of a seismogram signal quality classifier. We introduce a dataset composed of 5,223 seismograms distributed among the good, medium, and bad classes, with respective frequencies of 66.68%, 31.54%, and 1.76%. Our methodology is evaluated on a stratified 10-fold cross-validation…
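The latent-space interpolation mentioned in the abstract can be illustrated with a minimal sketch. This is not the SeismoFlow implementation: the functions `encode`, `decode`, `interpolate_latents`, and `synthesize` are hypothetical names, and the "flow" here is a toy elementwise affine map standing in for a trained model. With such a linear stand-in, interpolating in latent space is equivalent to interpolating the signals directly; a trained nonlinear flow would generally give different results.

```python
import random

def encode(x, scale, shift):
    """Toy invertible map x -> z (stand-in for a trained flow; assumption)."""
    return [(xi - b) / s for xi, s, b in zip(x, scale, shift)]

def decode(z, scale, shift):
    """Exact inverse of encode: z -> x."""
    return [zi * s + b for zi, s, b in zip(z, scale, shift)]

def interpolate_latents(z_a, z_b, alpha):
    """Linear interpolation between two latent vectors, alpha in [0, 1]."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]

def synthesize(rare_samples, scale, shift, n_new, rng):
    """Create n_new synthetic samples from pairs of rare-class samples."""
    synthetic = []
    for _ in range(n_new):
        # Pick two distinct rare-class samples and a random mixing weight.
        x_a, x_b = rng.sample(rare_samples, 2)
        alpha = rng.random()
        z = interpolate_latents(
            encode(x_a, scale, shift), encode(x_b, scale, shift), alpha
        )
        # Map the interpolated latent back to signal space.
        synthetic.append(decode(z, scale, shift))
    return synthetic

# Illustrative data only: three short "signals" from the rare class.
rare = [[0.0, 1.0, 2.0], [1.0, 3.0, 5.0], [2.0, 2.0, 2.0]]
scale = [2.0, 0.5, 1.5]   # hypothetical flow parameters (must be nonzero)
shift = [0.1, -0.2, 0.3]
synthetic = synthesize(rare, scale, shift, n_new=5, rng=random.Random(0))
```

The synthetic samples would then be added to the rare class in the training folds only, so that the evaluation folds keep the original class frequencies.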
Taxonomy
Topics: Reservoir Engineering and Simulation Methods · Machine Fault Diagnosis Techniques · Drilling and Well Engineering
Methods: Normalizing Flows · Affine Coupling · Activation Normalization · Invertible 1x1 Convolution · GLOW
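
Of the building blocks listed above, affine coupling is the one that makes a Glow-style flow cheaply invertible. The following is a minimal sketch of a single coupling layer on a flat vector, not the paper's architecture: `toy_net` is a hypothetical fixed function standing in for the learned network that predicts the log-scale and shift, and the input is assumed to have even length.

```python
import math

def toy_net(x_a):
    """Hypothetical conditioner: maps the untouched half to (log_scale, shift)."""
    log_s = [math.tanh(v) for v in x_a]
    t = [0.5 * v for v in x_a]
    return log_s, t

def coupling_forward(x):
    """Affine coupling: keep the first half, scale and shift the second half."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = toy_net(x_a)
    y_b = [xb * math.exp(ls) + ti for xb, ls, ti in zip(x_b, log_s, t)]
    # Log-determinant of the Jacobian is the sum of the log-scales.
    log_det = sum(log_s)
    return x_a + y_b, log_det

def coupling_inverse(y):
    """Exact inverse: the first half is unchanged, so log_s and t can be recomputed."""
    if len(y) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_s, t = toy_net(y_a)
    x_b = [(yb - ti) * math.exp(-ls) for yb, ls, ti in zip(y_b, log_s, t)]
    return y_a + x_b

x = [0.3, -1.2, 2.0, 0.7]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)  # recovers x up to floating-point error
```

Because the conditioner only ever sees the half that passes through unchanged, it never needs to be inverted itself, which is why it can be an arbitrary network in a real model.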
