Fake detection in imbalance dataset by Semi-supervised learning with GAN

Jinus Bordbar; Saman Ardalan; Mohammadreza Mohammadrezaie; Zahra Ghasemi

arXiv:2212.01071·cs.LG·February 11, 2025

Open Access

TL;DR

This paper introduces a semi-supervised GAN-based method utilizing auto-encoders to effectively detect fake social media accounts in imbalanced datasets, achieving high accuracy with limited labeled data.

Contribution

The study applies SGAN with auto-encoders to address imbalanced fake account detection, reducing reliance on extensive labeled data and computational resources.

Findings

01 Achieved 81% accuracy with only 100 labeled samples.

02 Effectively handled class imbalance in fake account detection.

03 Demonstrated efficiency in big data scenarios.

Abstract

As social media continues to grow rapidly, harassment on these platforms has become more prevalent, which has drawn researchers' interest to the field of fake detection. Social media data often forms complex graphs with numerous nodes, which poses several challenges: a large number of irrelevant features in the matrices, high data dispersion, and an imbalanced class distribution within the dataset. To overcome these challenges, researchers have employed auto-encoders and a combination of semi-supervised learning with a GAN algorithm, referred to as SGAN. Our proposed method uses auto-encoders for feature extraction and incorporates SGAN. By leveraging an unlabeled dataset, the unsupervised layer of SGAN compensates for the limited availability of labeled data, making…
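To illustrate how an SGAN discriminator can learn from both labeled and unlabeled samples, here is a minimal sketch. It uses one common formulation from the semi-supervised GAN literature, in which a single set of class logits feeds both a supervised softmax head and an unsupervised real-versus-generated head. The abstract does not specify the authors' exact architecture or losses, so this formulation is an assumption. The function names are hypothetical, and the logits are assumed to come from a discriminator applied to auto-encoder features.

```python
import math

EPS = 1e-12  # guards against log(0) when a probability saturates


def class_probabilities(logits):
    """Supervised head: softmax over the K real classes (e.g. genuine vs. fake account)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def real_probability(logits):
    """Unsupervised head: D(x) = Z / (Z + 1), where Z = sum(exp(logits)).

    Z / (Z + 1) equals sigmoid(log Z), so it is computed from a stable log-sum-exp.
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    if log_z >= 0:
        return 1.0 / (1.0 + math.exp(-log_z))
    e = math.exp(log_z)
    return e / (1.0 + e)


def supervised_loss(logits, label):
    """Cross-entropy on a labeled sample; only the few labeled accounts use this term."""
    return -math.log(class_probabilities(logits)[label] + EPS)


def unsupervised_loss(logits_real, logits_generated):
    """Real-vs-generated loss; needs no class labels, so unlabeled data contributes here."""
    p_real = real_probability(logits_real)
    p_gen = real_probability(logits_generated)
    return -math.log(p_real + EPS) - math.log(1.0 - p_gen + EPS)


# Hypothetical logits for one labeled, one unlabeled, and one generated sample.
labeled_logits, label = [2.0, -1.0], 0
unlabeled_logits = [1.5, 0.5]
generated_logits = [-3.0, -2.5]

total = supervised_loss(labeled_logits, label) + unsupervised_loss(
    unlabeled_logits, generated_logits
)
print(f"discriminator loss: {total:.4f}")
```

In this formulation the unsupervised term is what allows a small labeled set to be supplemented by unlabeled accounts: every unlabeled sample still produces a gradient through the shared logits, even though its class is unknown.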

Taxonomy

Topics: Spam and Phishing Detection · Imbalanced Data Classification Techniques · Hate Speech and Cyberbullying Detection

Methods: Test