A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media

Marzieh Mozafari; Reza Farahbakhsh; Noel Crespi

arXiv:1910.12574·cs.SI·October 29, 2019

TL;DR

This paper presents a transfer learning approach using BERT to improve hate speech detection in social media, addressing data scarcity and bias issues with promising results on Twitter datasets.

Contribution

Introduces a novel BERT-based transfer learning method with new fine-tuning techniques for hate speech detection in social media content.

Findings

01. Achieves high precision and recall on Twitter hate speech datasets.

02. Effectively captures biases in data annotation and collection.

03. Outperforms existing approaches in hate speech detection.

Abstract

Hateful and toxic content generated by a portion of users on social media is a rising phenomenon that has motivated researchers to dedicate substantial effort to the challenging task of identifying hateful content. We need not only an efficient automatic hate speech detection model based on advanced machine learning and natural language processing, but also a sufficiently large amount of annotated data to train that model. The lack of sufficient labelled hate speech data, along with existing biases, has been the main issue in this domain of research. To address these needs, in this study we introduce a novel transfer learning approach based on an existing pre-trained language model called BERT (Bidirectional Encoder Representations from Transformers). More specifically, we investigate the ability of BERT to capture hateful context within social media content by using…

Taxonomy

Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
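
Of the methods listed above, "Linear Warmup With Linear Decay" is a learning-rate schedule commonly used when fine-tuning BERT, and it is simple enough to sketch. The following is a minimal illustration and not the authors' code: the function name, the peak learning rate, and the step counts are all assumptions chosen for the example.

```python
def linear_warmup_linear_decay(step, total_steps, warmup_steps, peak_lr):
    """Return the learning rate at a 0-indexed training step.

    The rate rises linearly from 0 to peak_lr over warmup_steps,
    then falls linearly back to 0 at total_steps.
    All argument names and values are illustrative assumptions.
    """
    if total_steps <= 0 or not 0 <= warmup_steps <= total_steps:
        raise ValueError("need total_steps > 0 and 0 <= warmup_steps <= total_steps")
    if step < warmup_steps:
        # Warmup phase: fraction of the warmup completed so far.
        return peak_lr * step / warmup_steps
    # Decay phase: fraction of the post-warmup steps still remaining.
    decay_steps = max(1, total_steps - warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / decay_steps


if __name__ == "__main__":
    # Hypothetical settings: 100 steps, 10 of them warmup, peak rate 2e-5.
    for s in (0, 5, 10, 55, 100):
        print(s, linear_warmup_linear_decay(s, 100, 10, 2e-5))
```

In practice a schedule like this is usually handed to the optimizer as a per-step multiplier rather than called by hand; the standalone function only makes the shape of the curve explicit.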