GAN based Data Augmentation to Resolve Class Imbalance

Sairamvinay Vijayaraghavan; Terry Guan; Jason (Jinxiao) Song

arXiv:2206.05840 · cs.LG · June 14, 2022 · 1 citation

Open Access

TL;DR

This paper proposes using GANs to generate synthetic minority class data to address class imbalance in credit card fraud detection, improving machine learning model performance.

Contribution

It introduces a GAN-based data augmentation method specifically designed to balance class distribution in fraud detection datasets.
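The balancing step described here can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: `placeholder_generator` is a hypothetical stand-in for a trained GAN generator (it simply draws Gaussian noise), the helper name `balance_with_synthetic` is invented for this example, and topping the minority class up to an exact 50/50 split is an assumption rather than something the listing states.

```python
import numpy as np


def placeholder_generator(n_samples, n_features, rng):
    """Hypothetical stand-in for a trained GAN generator.

    A real generator would map latent noise to realistic minority-class
    (fraud) feature vectors; here we only return Gaussian noise.
    """
    return rng.normal(size=(n_samples, n_features))


def balance_with_synthetic(X, y, generator, rng, minority_label=1):
    """Append synthetic minority rows until both classes are the same size."""
    n_minority = int(np.sum(y == minority_label))
    n_majority = len(y) - n_minority
    n_needed = max(n_majority - n_minority, 0)

    X_syn = generator(n_needed, X.shape[1], rng)
    y_syn = np.full(n_needed, minority_label, dtype=y.dtype)

    return np.vstack([X, X_syn]), np.concatenate([y, y_syn])


rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))   # toy feature matrix (assumed shape)
y = np.zeros(1000, dtype=int)
y[:10] = 1                       # roughly 1 percent minority (fraud) cases

X_bal, y_bal = balance_with_synthetic(X, y, placeholder_generator, rng)
print(np.bincount(y_bal))        # class counts after augmentation
```

Only the training split would normally be augmented this way; evaluation data should keep its original class distribution so that reported metrics reflect real-world conditions.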

Findings

01. GAN-generated data improves classifier accuracy
02. Synthetic minority data enhances model generalization
03. Method effectively reduces class imbalance issues

Abstract

The number of credit card fraud cases has been growing as technology advances and people find ways to exploit it. Therefore, it is very important to implement a robust and effective method to detect such fraud. Machine learning algorithms are appropriate for these tasks since they try to maximize the accuracy of predictions and hence can be relied upon. However, there is a flaw wherein machine learning models may not perform well due to an imbalance in the class distribution within the sample set. In many related tasks, the datasets have a very small number of observed fraud cases (sometimes around 1 percent positive fraud instances). This imbalance may therefore bias a learning model toward predicting all labels as the majority class, leaving no scope for generalization in the model's predictions. We trained…
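The failure mode described in the abstract, where a model predicts every label as the majority class, can be made concrete with a small calculation. The sketch below assumes a hypothetical dataset of 10,000 transactions with a 1 percent fraud rate (the sizes and the function name `majority_baseline_metrics` are illustrative, not taken from the paper) and scores a classifier that always predicts "not fraud".

```python
def majority_baseline_metrics(n_total=10_000, fraud_rate=0.01):
    """Score a classifier that always predicts the majority (non-fraud) class."""
    n_fraud = round(n_total * fraud_rate)
    labels = [1] * n_fraud + [0] * (n_total - n_fraud)  # 1 = fraud, 0 = legitimate
    preds = [0] * n_total                               # always predict "legitimate"

    correct = sum(p == t for p, t in zip(preds, labels))
    true_pos = sum(p == 1 and t == 1 for p, t in zip(preds, labels))

    accuracy = correct / n_total
    recall = true_pos / n_fraud if n_fraud else 0.0     # share of fraud cases caught
    return accuracy, recall


accuracy, recall = majority_baseline_metrics()
print(f"accuracy={accuracy:.2%}, fraud recall={recall:.2%}")
```

Under these assumptions the accuracy is high while no fraud case is detected, which is why accuracy alone is a misleading metric on imbalanced data and why recall or F1 on the minority class is usually reported alongside it.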

Taxonomy

Topics: Imbalanced Data Classification Techniques · Vehicle License Plate Recognition · Financial Distress and Bankruptcy Prediction