Toxic Comments Hunter : Score Severity of Toxic Comments
Zhichang Wang, Qipeng Zhu

TL;DR
This paper presents a system for scoring the severity of toxic comments by collecting datasets, performing data cleaning and feature extraction, and training models using TFIDF and fine-tuned BERT, enabling real-time toxicity scoring.
Contribution
It introduces a comprehensive approach combining traditional and deep learning models for toxic comment severity assessment with real-time implementation.
Findings
Effective data cleaning and feature extraction methods for toxic comments
BERT-based models outperform traditional TFIDF models in severity scoring
Real-time toxicity scoring system successfully deployed
Abstract
Detecting and identifying toxic comments helps create a civilized and harmonious Internet environment. In this experiment, we collected several datasets related to toxic comments. Because of the characteristics of comment data, we cleaned the data and extracted features from different angles to obtain different toxic comment training sets. For model construction, we used these training sets to train TFIDF-based models and, separately, to fine-tune a BERT model. Finally, we packaged the code into software that scores toxic comments in real time.
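The cleaning and TFIDF steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the abstract does not state the authors' cleaning rules, their TFIDF settings, or which model sits on top of the features. The cleaning rules (lowercasing, URL removal, collapsing repeated characters), the function names, the toy comments, and the severity labels are all hypothetical, and a similarity-weighted average of training labels stands in for whatever regressor the authors actually trained.

```python
import math
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # a character repeated 3 or more times
TOKEN_RE = re.compile(r"[a-z']+")


def clean(comment):
    """Lowercase, drop URLs, squeeze long character runs, and tokenize."""
    text = comment.lower()
    text = URL_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1\1", text)  # e.g. "sooooo" becomes "soo"
    return TOKEN_RE.findall(text)


def fit_idf(docs):
    """Smoothed inverse document frequency for each token in the corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """L2-normalized sparse TFIDF vector; unseen tokens are ignored."""
    counts = Counter(tok for tok in doc if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two L2-normalized sparse vectors."""
    return sum(v * b.get(tok, 0.0) for tok, v in a.items())


def score(comment, train_vecs, labels, idf):
    """Stand-in scorer: similarity-weighted average of training labels."""
    query = tfidf(clean(comment), idf)
    sims = [cosine(query, vec) for vec in train_vecs]
    total = sum(sims)
    if total == 0:
        return 0.0  # no vocabulary overlap with the training set
    return sum(s * y for s, y in zip(sims, labels)) / total


# Hypothetical toy training data; severity labels in [0, 1] are made up.
comments = [
    "you are an idiot",
    "thanks for the helpful answer",
    "shut up you stupid idiot",
    "great point, I agree",
]
labels = [0.7, 0.0, 0.9, 0.0]

docs = [clean(c) for c in comments]
idf = fit_idf(docs)
train_vecs = [tfidf(d, idf) for d in docs]

print(score("you idiot", train_vecs, labels, idf))
print(score("thanks, great answer", train_vecs, labels, idf))
```

In this sketch a comment that shares vocabulary only with the toxic training examples receives a score between their labels, while a comment that overlaps only with benign examples scores zero. A production system would more plausibly fit a regression model on the TFIDF features rather than average neighbour labels.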
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Attention Dropout · Dropout · Linear Warmup With Linear Decay · Dense Connections · Residual Connection
