Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments

Sreyan Ghosh; Sonal Kumar

arXiv:2105.13959·cs.CL·May 31, 2021

TL;DR

This paper presents team Cisco's system for detecting toxic spans in online comments with transformer-based models. Its sequence tagging approach achieved competitive results in SemEval-2021 Task 5.

Contribution

The paper applies transformers to span-level toxicity detection and compares a sequence tagging approach with a dependency parsing approach for this task.

Findings

01. The sequence tagging approach achieved an F1 score of 0.6922.

02. The sequence tagging approach outperformed the dependency parsing approach.

03. Cisco's system ranked 7th on the shared task leaderboard.
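To make the reported F1 score concrete, the following is a minimal sketch of a character-offset F1 of the kind used to score toxic span predictions. It assumes that each post is scored by comparing the set of predicted toxic character offsets with the gold set and that per-post scores are averaged; the function names `span_f1` and `mean_span_f1` are hypothetical and are not taken from the paper or its repository.

```python
def span_f1(pred_offsets, gold_offsets):
    """F1 between predicted and gold toxic character offsets for one post.

    Assumption: a post with no gold offsets scores 1.0 only when the
    prediction is also empty, and 0.0 otherwise.
    """
    pred, gold = set(pred_offsets), set(gold_offsets)
    if not gold:
        return 1.0 if not pred else 0.0
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_span_f1(all_preds, all_golds):
    """Average the per-post F1 over a collection of posts."""
    scores = [span_f1(p, g) for p, g in zip(all_preds, all_golds)]
    return sum(scores) / len(scores)
```

For example, `span_f1([1, 2, 3], [2, 3, 4])` shares two offsets out of three on each side, so precision and recall are both 2/3.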

Abstract

Social network platforms are generally used to share positive, constructive, and insightful content. However, in recent times, people are often exposed to objectionable content such as threats, identity attacks, hate speech, insults, obscene texts, offensive remarks, or bullying. Existing work on toxic speech detection focuses on binary classification or on differentiating toxic speech among a small set of categories. This paper describes the system proposed by team Cisco for SemEval-2021 Task 5: Toxic Spans Detection, the first shared task focusing on detecting the spans in a text that contribute to its toxicity, in the English language. We approach this problem primarily in two ways: a sequence tagging approach and a dependency parsing approach. In our sequence tagging approach, we tag each token in a sentence under a particular tagging scheme. Our best performing architecture in this approach…
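As an illustration of the sequence tagging framing described in the abstract, the sketch below converts toxic character offsets into one tag per token. The abstract does not say which tagging scheme the authors use; the BIO scheme, the whitespace tokenization, and the helper name `char_offsets_to_bio` are all assumptions made for this example, not the authors' implementation.

```python
import re


def char_offsets_to_bio(text, toxic_offsets):
    """Label whitespace-delimited tokens with B/I/O tags.

    A token counts as toxic if any of its characters is in toxic_offsets.
    Simplification: adjacent toxic tokens are always merged into one span.
    """
    toxic = set(toxic_offsets)
    tokens, tags = [], []
    prev_toxic = False
    for match in re.finditer(r"\S+", text):
        is_toxic = any(i in toxic for i in range(match.start(), match.end()))
        if not is_toxic:
            tag = "O"
        elif prev_toxic:
            tag = "I"  # continues the toxic span opened by an earlier token
        else:
            tag = "B"  # first token of a new toxic span
        tokens.append(match.group())
        tags.append(tag)
        prev_toxic = is_toxic
    return tokens, tags


text = "You are a stupid idiot, honestly"
# Offsets 10..21 cover the characters of "stupid idiot".
tokens, tags = char_offsets_to_bio(text, range(10, 22))
```

A transformer-based tagger would typically operate on subword tokens rather than whitespace tokens, so a real pipeline would also need to align these labels with the tokenizer's output.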

Code & Models

Repositories

Sreyan88/SemEval-2021-Toxic-Spans-Detection
tf · Official