Identifying Offensive Posts and Targeted Offense from Twitter
Haimin Zhang, Debanjan Mahata, Simra Shahid, Laiba Mehnaz, Sarthak Anand, Yaman Singla, Rajiv Ratn Shah, Karan Uppal

TL;DR
This paper presents systems for SemEval 2019 Task 6: a neural network ensemble that detects offensive tweets (ranked 5th of 103) and a set of heuristics that decides whether an offensive tweet is targeted (ranked 8th of 75).
Contribution
The paper introduces ensemble neural network models for offensive language detection and heuristic-based methods for identifying targeted offensive tweets.
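As a rough illustration of what combining several classifiers can look like, the sketch below averages the per-tweet offensive probabilities of three models and thresholds the result. The source does not say how the ensemble members are combined, so the averaging rule, the 0.5 threshold, the function name `average_ensemble`, and the `OFF`/`NOT` label strings are all assumptions made for this example.

```python
def average_ensemble(prob_lists, threshold=0.5):
    """Soft-voting sketch: average each model's P(offensive) per tweet.

    prob_lists: one list of probabilities per model, all of equal length.
    Averaging and the 0.5 threshold are assumptions, not the paper's method.
    """
    n_models = len(prob_lists)
    n_items = len(prob_lists[0])
    averaged = [
        sum(model_probs[i] for model_probs in prob_lists) / n_models
        for i in range(n_items)
    ]
    # Hypothetical label strings: "OFF" = offensive, "NOT" = not offensive.
    return ["OFF" if p >= threshold else "NOT" for p in averaged]


# Made-up probabilities standing in for three trained models.
cnn_probs = [0.9, 0.2, 0.6]
bilstm_attention_probs = [0.8, 0.4, 0.3]
bilstm_bigru_probs = [0.7, 0.1, 0.5]

predictions = average_ensemble(
    [cnn_probs, bilstm_attention_probs, bilstm_bigru_probs]
)
```

Averaging probabilities is only one option; majority voting over hard labels or a learned weighting would fit the same interface.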
Findings
Achieved a macro F1 score of 0.807 in offensive tweet detection (Sub-task A)
Ranked 5th out of 103 participants in Sub-task A
Ranked 8th out of 75 participants in targeted offense detection (Sub-task B), with a macro F1 of 0.695
Abstract
In this paper, we present our approach and the system description for Sub-task A and Sub-task B of SemEval 2019 Task 6: Identifying and Categorizing Offensive Language in Social Media. Sub-task A involves identifying whether a given tweet is offensive, and Sub-task B involves detecting whether an offensive tweet is targeted towards someone (a group or an individual). Our model for Sub-task A is based on an ensemble of a Convolutional Neural Network, a Bidirectional LSTM with attention, and a Bidirectional LSTM + Bidirectional GRU, whereas for Sub-task B we rely on a set of heuristics derived from the training data and manual observation. We provide a detailed analysis of the results obtained using the trained models. Our team ranked 5th out of 103 participants in Sub-task A, achieving a macro F1 score of 0.807, and ranked 8th out of 75 participants in Sub-task B, achieving a macro F1 of 0.695.
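Both results above are reported as macro F1, which is the unweighted mean of the per-class F1 scores, so the rarer class counts as much as the common one. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1`, the label strings, and the toy predictions are illustrative assumptions and do not come from the paper or its data.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical labels: "OFF" = offensive, "NOT" = not offensive.
gold = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]
score = macro_f1(gold, pred, labels=["OFF", "NOT"])
```

In this toy case the per-class F1 scores are 0.5 for `OFF` and 0.75 for `NOT`, so the macro average is 0.625 even though four of the six predictions are correct.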
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Sigmoid Activation · Tanh Activation · Gated Recurrent Unit · Long Short-Term Memory
