UTNLP at SemEval-2021 Task 5: A Comparative Analysis of Toxic Span Detection using Attention-based, Named Entity Recognition, and Ensemble Models

Alireza Salemi; Nazanin Sabri; Emad Kebriaei; Behnam Bahrak; Azadeh Shakery

arXiv:2104.04770·cs.CL·April 13, 2021

TL;DR

This paper compares various models including attention, NER, and ensemble approaches for toxic span detection, aiming to improve interpretability of toxicity models and assist human moderation.

Contribution

It provides a comprehensive analysis of multiple modeling techniques and presents an ensemble approach that achieves competitive performance in toxic span detection.

Findings

01 Ensemble model achieved F1 of 0.684

02 Attention and NER models were evaluated

03 Keyword-based models served as initial baselines

Abstract

Detecting which parts of a sentence contribute to its toxicity, rather than providing a sentence-level verdict of hatefulness, would increase the interpretability of models and allow human moderators to better understand the outputs of the system. This paper presents the methodology and results of our team, UTNLP, in SemEval-2021 Task 5 on toxic spans detection. We test multiple models and contextual embeddings and report the best-performing setting. The experiments start with keyword-based models and are followed by attention-based, named entity-based, transformer-based, and ensemble models. Our best approach, an ensemble model, achieves an F1 of 0.684 in the competition's evaluation phase.
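
To make the task concrete, here is a minimal sketch of a keyword-based baseline together with a character-offset F1 score. It is an illustration only, not the authors' implementation. The function names `keyword_spans` and `char_f1`, the toy lexicon, and the example sentence are all hypothetical. The sketch assumes that spans are represented as sets of character offsets and scored per post, and that an empty prediction on an empty gold span scores 1.0.

```python
import re


def keyword_spans(text, lexicon):
    """Return sorted character offsets covered by any lexicon word.

    Matching is case-insensitive and restricted to whole words.
    The lexicon is a hypothetical list of toxic keywords.
    """
    offsets = set()
    for word in lexicon:
        pattern = r"\b" + re.escape(word) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            offsets.update(range(match.start(), match.end()))
    return sorted(offsets)


def char_f1(predicted, gold):
    """F1 between predicted and gold sets of character offsets for one post."""
    pred_set, gold_set = set(predicted), set(gold)
    if not gold_set:
        # Assumed convention: a post with no toxic span is scored 1.0
        # only when nothing is predicted.
        return 1.0 if not pred_set else 0.0
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy usage: the gold span covers "stupid idiot" including the space,
# while the keyword baseline marks the two words separately.
text = "You are a stupid idiot"
gold = list(range(10, 22))
pred = keyword_spans(text, ["stupid", "idiot"])
score = char_f1(pred, gold)
```

In this toy case the baseline misses only the space between the two words, so precision is perfect and recall is slightly below one. A system-level score would average `char_f1` over all posts in the evaluation set.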

Code & Models

Repositories

alirezasalemi7/SemEval2021-Toxic-Spans-Detection
Official