A Trustable LSTM-Autoencoder Network for Cyberbullying Detection on Social Media Using Synthetic Data

Mst Shapna Akter; Hossain Shahriar; Alfredo Cuzzocrea

arXiv:2308.09722·cs.LG·August 22, 2023

TL;DR

This paper introduces a trustable LSTM-Autoencoder network for detecting cyberbullying on social media in Hindi, Bangla, and English. It uses machine-translated synthetic data to offset the scarcity of datasets and reports up to 95% accuracy, outperforming the models it was compared against.

Contribution

The paper presents a novel LSTM-Autoencoder model trained on synthetic (machine-translated) data to address the scarcity of datasets in languages such as Hindi and Bangla, and reports state-of-the-art cyberbullying detection performance.

Findings

01. The proposed model achieved up to 95% accuracy.

02. It outperformed traditional models like LSTM, BiLSTM, and BERT.

03. Synthetic data helped improve detection across Hindi, Bangla, and English.

Abstract

Social media cyberbullying has a detrimental effect on human life. As online social networking grows daily, the amount of hate speech also increases. Such terrible content can cause depression and actions related to suicide. This paper proposes a trustable LSTM-Autoencoder Network for cyberbullying detection on social media using synthetic data. We have demonstrated a cutting-edge method to address data availability difficulties by producing machine-translated data. However, several languages such as Hindi and Bangla still lack adequate investigations due to a lack of datasets. We carried out experimental identification of aggressive comments on Hindi, Bangla, and English datasets using the proposed model and traditional models, including Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), LSTM-Autoencoder, Word2vec, Bidirectional Encoder Representations from…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Softmax · Dense Connections