Determination of toxic comments and unintended model bias minimization using Deep learning approach
Md Azim Khan

TL;DR
This paper presents a deep learning approach using fine-tuned BERT to detect toxic comments and minimize unintended biases related to identity features, outperforming traditional models in accuracy.
Contribution
It introduces a bias mitigation technique in toxic comment detection by fine-tuning BERT with weighted loss to address data imbalance and bias.
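As a rough illustration of how a weighted loss can counter class imbalance, the sketch below scales each example's binary cross-entropy by a per-class weight. The summary does not state the paper's exact weighting scheme, so the inverse-class-frequency weights and the helper names `class_weights` and `weighted_bce` are assumptions made for illustration only, and the labels and probabilities are toy values rather than results from the paper.

```python
import math

def class_weights(labels):
    """Assumed scheme: inverse-frequency weight n / (2 * count_c) per class."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}

def weighted_bce(probs, labels, weights, eps=1e-12):
    """Mean binary cross-entropy with each example scaled by its class weight."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * loss
    return total / len(labels)

# Toy data: 7 non-toxic comments (0) and 1 toxic comment (1).
labels = [0, 0, 0, 0, 0, 0, 0, 1]
probs = [0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 0.4]  # hypothetical model outputs

weights = class_weights(labels)
uniform = {0: 1.0, 1: 1.0}
print(weights)
print(weighted_bce(probs, labels, uniform), weighted_bce(probs, labels, weights))
```

With these weights the single toxic example contributes far more to the loss than it would under uniform weighting, so a model is penalized more heavily for missing the minority class.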
Findings
BERT achieved 89% accuracy in toxic comment classification.
Weighted loss helped reduce identity-related biases.
BERT outperformed Logistic Regression in both accuracy and bias minimization.
Abstract
Online conversations can be toxic and subject to threats, abuse, or harassment. To identify toxic text comments, several deep learning and machine learning models have been proposed over the years. However, recent studies demonstrate that, because of imbalances in the training data, some models are more likely to show unintended biases, including gender bias and identity bias. In this research, our aim is to detect toxic comments and reduce the unintended bias concerning identity features such as race, gender, sex, and religion by fine-tuning an attention-based model called BERT (Bidirectional Encoder Representations from Transformers). We apply a weighted loss to address the issue of unbalanced data and compare the performance of a fine-tuned BERT model with a traditional Logistic Regression model in terms of classification and bias minimization. The Logistic Regression model with the…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Adam · Layer Normalization · Residual Connection · Linear Warmup With Linear Decay · Dropout · Weight Decay
