Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based Approach to Offensive Language Identification

Ehsan Doostmohammadi; Hossein Sameti; Ali Saffar

arXiv:2009.10792 · cs.CL · September 24, 2020

TL;DR

This paper describes Ghmerti's deep learning models combining word and character features for offensive language detection in social media, achieving competitive results in SemEval-2019.

Contribution

It introduces a novel combination of character-level CNN and word-level RNN models with preprocessing for offensive language identification.
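To make the idea of combining a character-level CNN with a word-level RNN concrete, here is a minimal NumPy sketch of a forward pass. It is not the authors' implementation: the layer sizes, the plain (Elman) RNN cell, the concatenation of the two feature vectors, the sigmoid output, the toy token ids, and every function and parameter name are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def char_cnn(char_ids, emb, filters):
    """1-D convolution over character embeddings, then max-over-time pooling."""
    x = emb[char_ids]                                   # (n_chars, emb_dim)
    width = filters.shape[1]
    # Sliding windows of `width` consecutive characters.
    windows = np.stack([x[i:i + width] for i in range(len(x) - width + 1)])
    # Dot each window with each filter -> (n_windows, n_filters).
    feats = np.tensordot(windows, filters, axes=([1, 2], [1, 2]))
    return np.maximum(feats, 0.0).max(axis=0)           # ReLU, then max pooling

def word_rnn(word_ids, emb, w_in, w_rec):
    """Plain (Elman) RNN over word embeddings; returns the final hidden state."""
    h = np.zeros(w_rec.shape[0])
    for x in emb[word_ids]:
        h = np.tanh(w_in @ x + w_rec @ h)
    return h

def predict(char_ids, word_ids, p):
    """Concatenate both views of the text and map them to P(offensive)."""
    features = np.concatenate([
        char_cnn(char_ids, p["char_emb"], p["filters"]),
        word_rnn(word_ids, p["word_emb"], p["w_in"], p["w_rec"]),
    ])
    logit = p["w_out"] @ features + p["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))

# Hypothetical sizes: 64 character ids, 100 word ids, 16 filters of width 3.
params = {
    "char_emb": rng.normal(size=(64, 8)),
    "filters": rng.normal(size=(16, 3, 8)),
    "word_emb": rng.normal(size=(100, 12)),
    "w_in": rng.normal(size=(10, 12)),
    "w_rec": rng.normal(size=(10, 10)),
    "w_out": rng.normal(size=16 + 10),
    "b_out": 0.0,
}

text = "you are great"
char_ids = [ord(c) % 64 for c in text]   # toy character indexing
word_ids = [3, 17, 42]                   # toy word indices, one per token
prob = predict(char_ids, word_ids, params)
```

The character branch can pick up on obfuscated or misspelled tokens that a word vocabulary would miss, while the word branch models the sequence of tokens; in this sketch the two are simply concatenated before the output layer.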

Findings

1. Achieved 77.93% macro F1-score on subtask A

2. Demonstrated effectiveness of combined character and word models

3. Provided insights into social media offensive language detection

Abstract

This paper presents the models submitted by the Ghmerti team for subtasks A and B of the OffensEval shared task at SemEval 2019. OffensEval addresses the problem of identifying and categorizing offensive language in social media in three subtasks: whether a piece of content is offensive (subtask A), whether it is targeted (subtask B), and whether the target is an individual, a group, or another entity (subtask C). The proposed approach combines a character-level Convolutional Neural Network, a word-level Recurrent Neural Network, and some preprocessing. The proposed model achieves a macro-averaged F1-score of 77.93% on subtask A.
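Since the reported result is a macro-averaged F1-score, a small worked example may help: macro averaging computes F1 separately for each class and takes the unweighted mean, so the minority class counts as much as the majority class. The sketch below is pure Python; the function name and the toy labels are made up for illustration and are not taken from the shared-task data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels (OFF = offensive, NOT = not offensive), invented for this example.
y_true = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]
score = macro_f1(y_true, y_pred)

# A classifier that always predicts the majority class.
baseline = macro_f1(y_true, ["NOT"] * len(y_true))
```

On these toy labels the per-class F1 scores are 0.5 (OFF) and 0.75 (NOT), so the macro average is 0.625. The always-NOT baseline scores 0.0 on OFF and 0.8 on NOT, giving 0.4, which shows how macro averaging penalizes ignoring the minority class.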

Code & Models

Repositories

edoost/offenseval (tf, official)