HLE-UPC at SemEval-2021 Task 5: Multi-Depth DistilBERT for Toxic Spans Detection

Rafel Palliser-Sans; Albert Rial-Farràs

arXiv:2104.00639·cs.CL·August 3, 2021


TL;DR

This paper introduces a multi-depth DistilBERT model for detecting toxic spans in text, leveraging embeddings from various layers to improve performance in a complex, subjective task.

Contribution

The paper proposes a novel multi-depth DistilBERT approach that utilizes multiple layer embeddings to enhance toxic span detection accuracy.

Findings

01 Multi-depth embeddings improve model performance.

02 Using multiple layers captures complex toxicity cues.

03 Qualitative analysis confirms model's effectiveness.

Abstract

This paper presents our submission to SemEval-2021 Task 5: Toxic Spans Detection. The purpose of this task is to detect the spans that make a text toxic, which is complex for two reasons. First, toxicity is intrinsically subjective. Second, toxicity does not always come from single words such as insults or offenses, but sometimes from whole expressions formed by words that may not be toxic individually. Following this idea of focusing on both single words and multi-word expressions, we study the impact of using a multi-depth DistilBERT model, which uses embeddings from different layers to estimate the final per-token toxicity. Our quantitative results show that using information from multiple depths boosts the performance of the model. Finally, we also analyze our best model qualitatively.
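To make the multi-depth idea concrete, here is a minimal sketch of how embeddings from several layers might be combined into a per-token toxicity score and then turned into toxic character offsets. It is not the authors' implementation. Combining layers by concatenation, the single linear layer with a sigmoid, the 0.5 threshold, and all names (`concat_depths`, `token_toxicity`, `toxic_char_offsets`) are assumptions made for illustration, and the tiny hand-written vectors stand in for real DistilBERT hidden states.

```python
import math

def concat_depths(layer_states, layers):
    """For each token, concatenate its embeddings from the chosen layers.
    Concatenation is an assumption; the abstract only says several depths are used."""
    n_tokens = len(layer_states[layers[0]])
    return [
        [x for layer in layers for x in layer_states[layer][t]]
        for t in range(n_tokens)
    ]

def token_toxicity(features, weights, bias):
    """Hypothetical head: one linear layer plus a sigmoid, giving one probability per token."""
    probs = []
    for feat in features:
        z = sum(w * x for w, x in zip(weights, feat)) + bias
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs

def toxic_char_offsets(probs, token_spans, threshold=0.5):
    """Collect the character offsets covered by tokens scored above the threshold."""
    offsets = []
    for p, (start, end) in zip(probs, token_spans):
        if p > threshold:
            offsets.extend(range(start, end))
    return offsets

# Toy input: four tokens with their (start, end) character spans in the text.
text = "you are an idiot"
token_spans = [(0, 3), (4, 7), (8, 10), (11, 16)]

# Made-up 2-dimensional embeddings for two depths (one row per token).
layer_states = [
    [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.9, 0.8]],  # a shallower layer
    [[0.0, 0.2], [0.1, 0.0], [0.0, 0.0], [0.7, 0.9]],  # a deeper layer
]

features = concat_depths(layer_states, layers=[0, 1])  # 4 values per token
probs = token_toxicity(features, weights=[2.0, 2.0, 2.0, 2.0], bias=-3.0)
offsets = toxic_char_offsets(probs, token_spans)

print(offsets)                             # [11, 12, 13, 14, 15]
print(text[offsets[0]:offsets[-1] + 1])    # idiot
```

In a real setup the per-layer vectors would come from the encoder's hidden states and the weights would be learned during fine-tuning; only the flow from several depths to one score per token, and from tokens to character offsets, is what the sketch is meant to show.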


Code & Models

Repositories

rafelps/HLE-UPC-SemEval-2021-ToxicSpansDetection
PyTorch · Official


Taxonomy

Methods: Linear Layer · Dropout · Linear Warmup With Linear Decay · Residual Connection · Layer Normalization · Adam · Multi-Head Attention · Attention Dropout · WordPiece