Toxic Comments Hunter : Score Severity of Toxic Comments

Zhichang Wang; Qipeng Zhu

arXiv:2203.03548·cs.CL·March 8, 2022

TL;DR

This paper presents a system for scoring the severity of toxic comments: the authors collect several datasets, clean the data and extract features, train TF-IDF-based models and a fine-tuned BERT model, and package the code as software that scores comments in real time.

Contribution

It combines traditional (TF-IDF-based) and deep learning (fine-tuned BERT) models for scoring the severity of toxic comments, and packages them as software for real-time scoring.

Findings

01. Effective data cleaning and feature extraction methods for toxic comments

02. BERT-based models outperform traditional TF-IDF models in severity scoring

03. Real-time toxicity scoring system successfully deployed

Abstract

The detection and identification of toxic comments are conducive to creating a civilized and harmonious Internet environment. In this experiment, we collected various datasets related to toxic comments. Because of the characteristics of comment data, we performed data cleaning and feature extraction on it from different angles to obtain different toxic comment training sets. For model construction, we used the training sets to train TF-IDF-based models and, separately, to fine-tune a BERT model. Finally, we encapsulated the code into software that scores toxic comments in real time.
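The cleaning and TF-IDF feature extraction steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' cleaning rules or model details, so everything here is an assumption: the function names `clean_comment` and `tfidf_vectors` are hypothetical, the cleaning rules are common choices for comment text, and the smoothed IDF has the same form as scikit-learn's default.

```python
import math
import re
from collections import Counter


def clean_comment(text):
    """Normalize a raw comment (hypothetical rules, not the paper's)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)  # squeeze runs: "sooooo" -> "soo"
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


def tfidf_vectors(comments):
    """Return one {term: weight} dict per comment, using a smoothed IDF."""
    docs = [clean_comment(c).split() for c in comments]
    n_docs = len(docs)
    # Document frequency: number of comments in which each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {}
        for term, count in counts.items():
            tf = count / len(doc)  # relative term frequency
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    sample = ["You are SOOOOO dumb!!! http://x.com/a", "You are kind"]
    for comment, vec in zip(sample, tfidf_vectors(sample)):
        print(clean_comment(comment), vec)
```

Terms that appear in fewer comments get a larger weight, so in the sample a rare word such as "dumb" is weighted above a shared word such as "you". In a full pipeline these vectors would feed a regression model that predicts a severity score; that step is omitted because the abstract does not say which model the authors used.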

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Attention Dropout · Dropout · Linear Warmup With Linear Decay · Dense Connections · Residual Connection