"To Target or Not to Target": Identification and Analysis of Abusive Text Using Ensemble of Classifiers

Gaurav Verma; Niyati Chhaya; Vishwa Vinay

arXiv:2006.03256 · cs.CL · June 8, 2020 · 1 citation

Open Access

TL;DR

This paper introduces an ensemble machine learning approach to detect abusive language on social media, achieving competitive results using only text features and offering insights into linguistic properties of harmful content.

Contribution

It proposes a novel stacked ensemble method that captures diverse linguistic features for abusive text detection without relying on user or network data.

Findings

01 Achieves results comparable to the state of the art on the Twitter Abusive Behavior dataset (Founta et al. 2018)

02 Relies solely on textual properties for classification

03 Provides insights into linguistic features of abusive language

Abstract

With rising concern around abusive and hateful behavior on social media platforms, we present an ensemble learning method to identify and analyze the linguistic properties of such content. Our stacked ensemble comprises three machine learning models that capture different aspects of language and provide diverse and coherent insights about inappropriate language. The proposed approach provides results comparable to the existing state of the art on the Twitter Abusive Behavior dataset (Founta et al. 2018) without using any user or network-related information, relying solely on textual properties. We believe that the presented insights and discussion of shortcomings of current approaches will highlight potential directions for future research.
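
The stacking idea described in the abstract can be illustrated with a small sketch. This is a minimal, standard-library-only illustration and not the authors' implementation: the abstract does not name the three models, so the feature views (`word_feats`, `char_feats`, `style_feats`), the naive Bayes base learners, the logistic-regression meta-learner, and the toy sentences below are all assumptions made for this example. The base learners use text only, and the meta-learner is trained on their out-of-fold predictions.

```python
import math
from collections import Counter

# Three hypothetical text-only feature views (assumptions, not from the paper).
def word_feats(text):
    return text.lower().split()

def char_feats(text):
    t = text.lower()
    return [t[i:i + 3] for i in range(len(t) - 2)]

def style_feats(text):
    feats = []
    if "!" in text:
        feats.append("has_exclaim")
    if any(w.isupper() and len(w) > 1 for w in text.split()):
        feats.append("has_caps_word")
    if any(w in ("you", "your") for w in text.lower().split()):
        feats.append("second_person")
    return feats or ["none"]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1 / (1 + math.exp(-z))

class NaiveBayes:
    """Binary multinomial naive Bayes with add-one smoothing over one view."""
    def __init__(self, featurize):
        self.featurize = featurize

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for t, y in zip(texts, labels):
            self.counts[y].update(self.featurize(t))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def predict_proba(self, text):
        logp = {}
        for y in (0, 1):
            total = sum(self.counts[y].values()) + len(self.vocab) + 1
            logp[y] = math.log((self.docs[y] + 1) / (sum(self.docs.values()) + 2))
            for f in self.featurize(text):
                logp[y] += math.log((self.counts[y][f] + 1) / total)
        return sigmoid(logp[1] - logp[0])  # P(label = 1 | text)

def fit_logreg(X, y, lr=0.5, epochs=500):
    """Meta-learner: logistic regression fitted by stochastic gradient descent."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(w[-1] + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * g * xj
            w[-1] -= lr * g
    return w

def fit_stack(texts, labels, views, k=4):
    n = len(texts)
    meta_X = [[0.0] * len(views) for _ in range(n)]
    for fold in range(k):  # out-of-fold predictions become meta-features
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        for j, view in enumerate(views):
            model = NaiveBayes(view).fit([texts[i] for i in train],
                                         [labels[i] for i in train])
            for i in held:
                meta_X[i][j] = model.predict_proba(texts[i])
    meta_w = fit_logreg(meta_X, labels)
    bases = [NaiveBayes(v).fit(texts, labels) for v in views]  # refit on all data
    return bases, meta_w

def predict_stack(bases, meta_w, text):
    x = [b.predict_proba(text) for b in bases]
    return sigmoid(meta_w[-1] + sum(w * xi for w, xi in zip(meta_w, x)))

# Made-up toy data for illustration only (1 = abusive, 0 = normal).
TEXTS = ["you are so stupid", "lovely weather today", "SHUT UP you idiot!",
         "see you at the meeting", "you people are trash", "thanks for the help",
         "nobody likes you, loser!", "great game last night"]
LABELS = [1, 0, 1, 0, 1, 0, 1, 0]
VIEWS = [word_feats, char_feats, style_feats]

bases, meta_w = fit_stack(TEXTS, LABELS, VIEWS)
print(predict_stack(bases, meta_w, "you are an idiot"))
```

The learned weights in `meta_w` indicate how much the meta-learner relies on each view, which is one simple way an ensemble of this kind can hint at which textual properties matter; the paper's actual analysis may differ.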

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Authorship Attribution and Profiling