IITK@Detox at SemEval-2021 Task 5: Semi-Supervised Learning and Dice Loss for Toxic Spans Detection

Archit Bansal; Abhay Kaushik; Ashutosh Modi

arXiv:2104.01566·cs.CL·April 6, 2021

TL;DR

This paper applies semi-supervised learning and Self-Adjusting Dice Loss to toxic span detection in order to counter the task's small training set and class imbalance. The submitted system is an ensemble of Transformer models, each trained with one of the two techniques.

Contribution

The paper introduces semi-supervised learning and Self-Adjusting Dice Loss for toxic span detection, a novel combination for this task.

Findings

01

Achieved ninth place on the SemEval-2021 Task 5 leaderboard.

02

Ensemble of Transformer models improved detection accuracy.

03

Techniques effectively addressed data scarcity and class imbalance.

Abstract

In this work, we present our approach and findings for SemEval-2021 Task 5: Toxic Spans Detection. The task's main aim was to identify spans to which a given text's toxicity could be attributed. The task is challenging mainly due to two constraints: the small training dataset and the imbalanced class distribution. Our paper investigates two techniques, semi-supervised learning and learning with Self-Adjusting Dice Loss, for tackling these challenges. Our submitted system (ranked ninth on the leaderboard) consisted of an ensemble of various pre-trained Transformer Language Models trained using either of the above-proposed techniques.
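As a rough illustration of the loss named in the abstract, the sketch below implements one commonly cited form of Self-Adjusting Dice Loss for binary token labels (toxic vs. non-toxic), following the formulation of Li et al. (2020), "Dice Loss for Data-imbalanced NLP Tasks": DSC = (2 (1 - p)^alpha p y + gamma) / ((1 - p)^alpha p + y + gamma), with the loss taken as 1 - DSC and averaged over tokens. The function name, the default values of alpha and gamma, and the per-token binary setup are assumptions made for illustration; the abstract does not state the authors' exact configuration.

```python
from typing import Sequence


def self_adjusting_dice_loss(
    probs: Sequence[float],
    labels: Sequence[int],
    alpha: float = 1.0,   # assumed default; strength of the (1 - p)^alpha down-weighting
    gamma: float = 1.0,   # assumed default; smoothing constant
) -> float:
    """Mean of (1 - DSC) over tokens.

    probs  -- predicted probability of the positive (toxic) class per token
    labels -- gold label per token, 1 for toxic and 0 for non-toxic
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if len(probs) == 0:
        raise ValueError("at least one token is required")

    total = 0.0
    for p, y in zip(probs, labels):
        # The factor (1 - p)^alpha shrinks the contribution of tokens
        # that the model already assigns a high positive probability.
        weighted = ((1.0 - p) ** alpha) * p
        dsc = (2.0 * weighted * y + gamma) / (weighted + y + gamma)
        total += 1.0 - dsc
    return total / len(probs)


if __name__ == "__main__":
    # Hypothetical per-token probabilities and labels for one short text.
    example_probs = [0.9, 0.2, 0.5, 0.1]
    example_labels = [1, 0, 1, 0]
    print(self_adjusting_dice_loss(example_probs, example_labels))
```

In a real training loop this quantity would be computed on the model's softmax outputs with an autodiff framework rather than on plain Python floats; the version above only shows how the terms fit together.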

Code & Models

Repositories

architb1703/Toxic_Span · Official

Taxonomy

Methods: Linear Layer · Dice Loss · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Dropout · Attention Is All You Need · Byte Pair Encoding · Residual Connection · Layer Normalization