UIT-ISE-NLP at SemEval-2021 Task 5: Toxic Spans Detection with BiLSTM-CRF and ToxicBERT Comment Classification

Son T. Luu; Ngan Luu-Thuy Nguyen

arXiv:2104.10100·cs.CL·August 2, 2021

TL;DR

This paper describes a system that combines a BiLSTM-CRF sequence labeler with ToxicBERT comment classification to detect toxic spans in online comments, reaching an F1-score of 62.23% on SemEval-2021 Task 5.

Contribution

The paper introduces a hybrid BiLSTM-CRF and ToxicBERT approach specifically designed for toxic span detection in social media posts.

Findings

01

Achieved 62.23% F1-score on the Toxic Spans Detection task.

02

Demonstrated effectiveness of combining sequence labeling with classification models.

03

Provided a new baseline for toxic span detection in SemEval-2021.

Abstract

We present our work on SemEval-2021 Task 5, Toxic Spans Detection. The task is to build a model that identifies the toxic words within a whole post. We train the detection model using a BiLSTM-CRF combined with ToxicBERT classification. Our model achieves an F1-score of 62.23% on the Toxic Spans Detection task.
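
To make the reported F1-score concrete, here is a minimal sketch of how a span-level F1 could be computed for a single post. It assumes that predictions and gold annotations are compared as sets of character offsets, which is our reading of the shared-task setup and is not stated in the abstract. The helper names `tags_to_offsets` and `span_f1`, the whitespace tokenization, and the example post are illustrative and are not taken from the authors' repository.

```python
from typing import List, Set


def tags_to_offsets(text: str, tokens: List[str], tags: List[int]) -> Set[int]:
    """Map per-token toxic flags (1 = toxic, 0 = clean) to character offsets.

    Hypothetical helper: assumes the tokens appear in `text` in order.
    """
    offsets: Set[int] = set()
    cursor = 0
    for token, tag in zip(tokens, tags):
        start = text.find(token, cursor)
        if start < 0:
            continue  # token could not be aligned; skip it
        end = start + len(token)
        if tag == 1:
            offsets.update(range(start, end))
        cursor = end
    return offsets


def span_f1(pred: Set[int], gold: Set[int]) -> float:
    """F1 between predicted and gold character-offset sets for one post."""
    if not pred and not gold:
        return 1.0  # assumed convention: nothing to find, nothing predicted
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative post: a sequence labeler flags the last two tokens.
text = "You are a stupid idiot"
tokens = text.split()
tags = [0, 0, 0, 1, 1]

pred_offsets = tags_to_offsets(text, tokens, tags)
gold_offsets = set(range(10, 22))  # gold span also covers the space between words
score = span_f1(pred_offsets, gold_offsets)
print(f"{score:.4f}")
```

In this example the prediction misses only the space between the two flagged words, so precision is 1 and recall is 11/12. A corpus-level score would then average the per-post values, again under the assumption stated above.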

Code & Models

Repositories

sonlam1102/toxic-spans-detection-bilstm_crf
Official

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Linear Warmup With Linear Decay · WordPiece · Attention Dropout · Layer Normalization