TL;DR
This paper describes the Ghmerti team's deep learning models, which combine word and character features for offensive language detection in social media and reach a 77.93% macro-averaged F1-score on subtask A of OffensEval at SemEval-2019.
Contribution
It introduces a combination of a character-level CNN and a word-level RNN, together with preprocessing, for offensive language identification.
Findings
Achieved a 77.93% macro-averaged F1-score on subtask A
Demonstrated effectiveness of combined character and word models
Provided insights into social media offensive language detection
Abstract
This paper presents the models submitted by the Ghmerti team for subtasks A and B of the OffensEval shared task at SemEval 2019. OffensEval addresses the problem of identifying and categorizing offensive language in social media through three subtasks: whether or not a post is offensive (subtask A), whether an offensive post is targeted (subtask B), and whether the target is an individual, a group, or another entity (subtask C). The proposed approach combines a character-level Convolutional Neural Network, a word-level Recurrent Neural Network, and some preprocessing. The proposed model achieves a macro-averaged F1-score of 77.93% on subtask A.
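The macro-averaged F1-score reported above weights every class equally, regardless of how many examples it has. The sketch below shows one way to compute it in plain Python. The function name `macro_f1` and the example labels `OFF` and `NOT` are illustrative assumptions, not code or data from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, made up for illustration only
gold = ["OFF", "NOT", "NOT", "NOT"]
pred = ["OFF", "OFF", "NOT", "NOT"]
score = macro_f1(gold, pred)
```

Because each class contributes equally to the average, a model that ignores the minority class is penalized more heavily than it would be under plain accuracy.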
