Identifying Offensive Posts and Targeted Offense from Twitter

Haimin Zhang; Debanjan Mahata; Simra Shahid; Laiba Mehnaz; Sarthak Anand; Yaman Singla; Rajiv Ratn Shah; Karan Uppal

arXiv:1904.09072 · cs.CL · April 22, 2019 · 1 citation

TL;DR

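This paper presents models for detecting offensive tweets and for determining whether an offensive tweet is targeted. An ensemble of neural networks handles offense detection and a set of heuristics handles target detection; in SemEval 2019 Task 6 the systems ranked 5th of 103 in Sub-task A and 8th of 75 in Sub-task B.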

Contribution

The paper introduces ensemble neural network models for offensive language detection and heuristic-based methods for identifying targeted offensive tweets.

Findings

01 Achieved a macro F1 score of 0.807 in offensive tweet detection (Sub-task A)

02 Ranked 5th out of 103 participants in Sub-task A

03 Ranked 8th out of 75 participants in targeted offense detection (Sub-task B), with a macro F1 of 0.695

Abstract

In this paper we present our approach and the system description for Sub-task A and Sub-task B of SemEval 2019 Task 6: Identifying and Categorizing Offensive Language in Social Media. Sub-task A involves identifying whether a given tweet is offensive, and Sub-task B involves detecting whether an offensive tweet is targeted towards someone (a group or an individual). Our model for Sub-task A is based on an ensemble of a Convolutional Neural Network, a Bidirectional LSTM with attention, and a Bidirectional LSTM + Bidirectional GRU, whereas for Sub-task B we rely on a set of heuristics derived from the training data and manual observation. We provide a detailed analysis of the results obtained using the trained models. Our team ranked 5th out of 103 participants in Sub-task A, achieving a macro F1 score of 0.807, and ranked 8th out of 75 participants in Sub-task B, achieving a macro F1 of 0.695.
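The abstract names two mechanics that a small example may make concrete: combining several classifiers into an ensemble, and scoring predictions with macro F1. The sketch below is illustrative only. The abstract does not say how the three networks are combined, so averaging their predicted probabilities (soft voting) is an assumption, as are the 0.5 threshold, the function names `soft_vote` and `macro_f1`, and every probability value shown. The label strings `OFF` and `NOT` are also assumed here for readability.

```python
from typing import List, Sequence


def soft_vote(prob_per_model: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[str]:
    """Average each model's P(offensive) per tweet, then threshold the mean.

    Assumption: soft voting with a 0.5 threshold. The abstract does not
    state the combination rule actually used by the authors.
    """
    n_models = len(prob_per_model)
    n_tweets = len(prob_per_model[0])
    labels = []
    for i in range(n_tweets):
        mean_p = sum(model[i] for model in prob_per_model) / n_models
        labels.append("OFF" if mean_p >= threshold else "NOT")
    return labels


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(gold) | set(pred))
    scores = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Made-up probabilities for four tweets from three hypothetical models,
# standing in for the CNN, BiLSTM with attention, and BiLSTM + BiGRU.
cnn = [0.9, 0.2, 0.6, 0.1]
bilstm_attention = [0.8, 0.4, 0.3, 0.2]
bilstm_bigru = [0.7, 0.3, 0.5, 0.4]

gold = ["OFF", "NOT", "OFF", "NOT"]
pred = soft_vote([cnn, bilstm_attention, bilstm_bigru])
score = macro_f1(gold, pred)
```

Macro F1 averages the per-class scores without weighting by class frequency, so errors on the less frequent class count as much as errors on the more frequent one. That is why it is a common choice when one label (here, offensive tweets) is much rarer than the other.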

Taxonomy

Topics: Hate Speech and Cyberbullying Detection

Methods: Sigmoid Activation · Tanh Activation · Gated Recurrent Unit · Long Short-Term Memory