SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

TL;DR
SemEval-2019 OffensEval was a large-scale shared task that introduced a new dataset (OLID) and evaluated systems on detecting offensive language in social media posts, categorizing its type, and identifying its target.
Contribution
Introduction of the OLID dataset and a comprehensive evaluation of systems across three sub-tasks for offensive language detection.
Findings
High participation: about 800 teams signed up, and 115 of them submitted results.
Significant variation in system performance.
Insights into challenges of offensive language classification.
Abstract
We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets. It featured three sub-tasks. In sub-task A, the goal was to discriminate between offensive and non-offensive posts. In sub-task B, the focus was on the type of offensive content in the post. Finally, in sub-task C, systems had to detect the target of the offensive posts. OffensEval attracted a large number of participants and it was one of the most popular tasks in SemEval-2019. In total, about 800 teams signed up to participate in the task, and 115 of them submitted results, which we present and analyze in this report.
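The three sub-tasks form a hierarchy: the type of offense (sub-task B) only applies to offensive posts, and the target (sub-task C) only applies to offensive posts that have one. The sketch below models that nesting as a small data structure with a consistency check. The abstract does not name the labels, so the tag strings used here (OFF/NOT, TIN/UNT, IND/GRP/OTH) follow the ones commonly cited for OLID and should be treated as assumptions, as should the hypothetical names `Annotation` and `is_consistent` and the rule that untargeted posts carry no sub-task C label.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed tag sets; the abstract describes the sub-tasks but not the tag strings.
LEVEL_A = {"OFF", "NOT"}         # sub-task A: offensive vs. not offensive
LEVEL_B = {"TIN", "UNT"}         # sub-task B: targeted insult/threat vs. untargeted
LEVEL_C = {"IND", "GRP", "OTH"}  # sub-task C: individual, group, or other target


@dataclass(frozen=True)
class Annotation:
    """Hypothetical container for one post's three-level annotation."""
    offensive: str                # sub-task A label
    kind: Optional[str] = None    # sub-task B label, only for offensive posts
    target: Optional[str] = None  # sub-task C label, only for targeted posts


def is_consistent(a: Annotation) -> bool:
    """Check that lower-level labels appear only where the hierarchy allows."""
    if a.offensive not in LEVEL_A:
        return False
    if a.offensive == "NOT":
        # Non-offensive posts carry no type or target label.
        return a.kind is None and a.target is None
    if a.kind not in LEVEL_B:
        return False
    if a.kind == "UNT":
        # Untargeted offensive posts carry no target label.
        return a.target is None
    return a.target in LEVEL_C


if __name__ == "__main__":
    print(is_consistent(Annotation("OFF", "TIN", "GRP")))
    print(is_consistent(Annotation("NOT", "TIN")))
```

Keeping the check separate from the container makes it easy to validate a whole file of annotations before training a separate classifier for each level.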
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Swearing, Euphemism, Multilingualism · Bullying, Victimization, and Aggression
